Funding: Supported by grants from the Intramural Research Program of the NIA, NIH, Department of Health and Human Services (Grant No. ZIAAG000534), as well as the National Institute of Neurological Disorders and Stroke, NIH, USA (Grant Nos. R01 NS120992 and U54 NS123743) to Mercedes Prudencio.
Abstract: This is a correction to: Ziyi Li, Cory A. Weller, Syed Shah, Nicholas L. Johnson, Ying Hao, Paige B. Jarreau, Jessica Roberts, Deyaan Guha, Colleen Bereda, Sydney Klaisner, Pedro Machado, Matteo Zanovello, Mercedes Prudencio, Bjorn Oskarsson, Nathan P. Staff, Dennis W. Dickson, Pietro Fratta, Leonard Petrucelli, Priyanka Narayan, Mark R. Cookson, Michael E. Ward, Andrew B. Singleton, Mike A. Nalls, Yue A. Qi, ProtPipe: A Multifunctional Data Analysis Pipeline for Proteomics and Peptidomics, Genomics, Proteomics & Bioinformatics, Volume 22, Issue 6, December 2024, qzae083, https://doi.org/10.1093/gpbjnl/qzae083.
Abstract: Single-photon sensors are novel devices with extremely high single-photon sensitivity and temporal resolution. However, these advantages also make them highly susceptible to noise. Moreover, single-photon cameras face severe quantization, as low as 1 bit/frame. These factors make recovering high-quality scene information from noisy single-photon data a daunting task. Most current image reconstruction methods for single-photon data are purely mathematical approaches, which limits information utilization and algorithm performance. In this work, we propose a hybrid information enhancement model that significantly improves the efficiency of information utilization by leveraging attention mechanisms in both spatial and channel branches. Furthermore, we introduce a structural feature enhancement module for the FFN of the transformer, which explicitly improves the model's ability to extract and enhance high-frequency structural information through two symmetric convolution branches. Additionally, we propose a single-photon data simulation pipeline based on RAW images to address the lack of single-photon datasets. Experimental results show that the proposed method outperforms state-of-the-art methods at various noise levels and recovers high-frequency structures and extracts information more effectively.
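To make the simulation idea concrete: under a Poisson arrival model, a single-photon pixel fires in a frame with probability 1 - exp(-(φ + d)), where φ is the signal flux and d the dark-count rate. The sketch below is only an illustrative stand-in for the paper's RAW-based pipeline (the function names and parameters are hypothetical); it generates 1-bit frames from a linear RAW-like image and inverts the model for a naive baseline reconstruction:

```python
import numpy as np

def simulate_binary_frames(raw_linear, num_frames=200, max_flux=2.0,
                           dark_rate=0.01, rng=None):
    """Generate 1-bit single-photon frames from a linear RAW-like image.
    Per-frame detection probability: p = 1 - exp(-(flux + dark_rate))."""
    rng = np.random.default_rng(0) if rng is None else rng
    flux = raw_linear / raw_linear.max() * max_flux       # photons per frame
    p_detect = 1.0 - np.exp(-(flux + dark_rate))
    return (rng.random((num_frames, *raw_linear.shape)) < p_detect).astype(np.uint8)

def naive_flux_estimate(frames, dark_rate=0.01):
    """Invert the detection model from the mean frame (a simple MLE);
    learned methods aim to beat this baseline at low frame counts."""
    p_hat = frames.mean(axis=0).clip(1e-6, 1 - 1e-6)
    return -np.log1p(-p_hat) - dark_rate

raw = np.random.default_rng(1).random((64, 64))           # stand-in RAW image
frames = simulate_binary_frames(raw, num_frames=500)
print(frames.shape, naive_flux_estimate(frames).shape)    # (500, 64, 64) (64, 64)
```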
Abstract: In the additive manufacturing field, current research on data processing mainly focuses on the slicing of large STL files or complicated CAD models. A parallel algorithm offers great advantages for improving efficiency and reducing slicing time, but traditional algorithms cannot make full use of multi-core CPU hardware resources. This paper presents a fast parallel algorithm to speed up data processing. The algorithm is designed around a pipeline mode, and its complexity is analyzed theoretically. To evaluate its performance, the effects of thread count and layer count are investigated in a series of experiments. The experimental results show that thread count and layer count are the two dominant factors in the speedup ratio. The trend of speedup versus thread count shows a positive relationship that agrees closely with Amdahl's law, and the trend of speedup versus layer count likewise shows a positive relationship consistent with Gustafson's law. The new algorithm uses topological information to compute contours in parallel. Another parallel algorithm, based on data parallelism, is used in the experiments to show that the pipeline parallel mode is more efficient. A concluding case study demonstrates the outstanding performance of the new parallel algorithm: compared with the serial slicing algorithm, the pipeline parallel algorithm makes full use of multi-core CPU hardware and accelerates the slicing process, and compared with the data-parallel slicing algorithm, it achieves a much higher speedup ratio and efficiency.
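As a concrete illustration of the pipeline parallel mode (a sketch, not the paper's implementation; `intersect_layer` and `link_contours` stand in for hypothetical geometry kernels), the code below overlaps two slicing stages through a bounded queue and includes the Amdahl's-law speedup formula the thread-count experiments are compared against:

```python
import queue
import threading

def pipeline_slice(mesh, layer_heights, intersect_layer, link_contours):
    """Two-stage pipeline slicing sketch: stage 1 intersects the mesh with
    each cutting plane; stage 2 links the resulting segments into contours.
    The two stages run concurrently, which is the source of the speedup."""
    segments_q = queue.Queue(maxsize=8)    # bounded queue couples the stages
    contours = [None] * len(layer_heights)

    def stage_intersect():
        for i, z in enumerate(layer_heights):
            segments_q.put((i, intersect_layer(mesh, z)))
        segments_q.put(None)               # sentinel: no more layers

    def stage_link():
        while (item := segments_q.get()) is not None:
            i, segs = item
            contours[i] = link_contours(segs)

    workers = [threading.Thread(target=stage_intersect),
               threading.Thread(target=stage_link)]
    for t in workers: t.start()
    for t in workers: t.join()
    return contours

def amdahl_speedup(serial_fraction, n_threads):
    """Amdahl's law: with serial fraction s, speedup on n workers
    is 1 / (s + (1 - s) / n)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_threads)

print(amdahl_speedup(0.1, 8))   # ~4.7x, saturating as n grows
```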
Abstract: A profound understanding of the costs of performing condition assessment on buried drinking water pipeline infrastructure is required for enhanced asset management. Toward this end, an automated and uniform method of collecting cost data can give water utilities a means of viewing, understanding, interpreting and visualizing complex geographically referenced cost information to reveal data relationships, patterns and trends. However, there has been no standard data model that allows automated data collection and interoperability across platforms. The primary objective of this research is to develop a standard cost data model for drinking water pipeline condition assessment projects and to conflate disparate datasets from different utilities. The capabilities of this model are further demonstrated through trend analyses. Field mapping files are generated from the standard data model and demonstrated in an interactive web map created using the Google Maps API (application programming interface) for JavaScript, which allows the user to toggle project examples and to perform regional comparisons. The aggregation of standardized data and its further use in mapping applications will help provide timely access to condition assessment cost information and resources, leading to enhanced asset management and resource allocation for drinking water utilities.
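A minimal sketch of what one record in such a standard cost data model might look like; the field names here are purely illustrative assumptions, not the model's published schema:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class AssessmentCostRecord:
    """Illustrative record for one condition assessment project."""
    utility_id: str
    project_id: str
    technology: str          # e.g. "acoustic", "CCTV", "electromagnetic"
    pipe_material: str
    pipe_diameter_mm: float
    length_assessed_m: float
    total_cost_usd: float
    latitude: float          # geographic reference for the web-map layer
    longitude: float

    def unit_cost_usd_per_m(self) -> float:
        return self.total_cost_usd / self.length_assessed_m

record = AssessmentCostRecord("UTIL-01", "P-2024-007", "acoustic",
                              "cast iron", 300.0, 1500.0, 42000.0,
                              36.13, -80.28)
print(json.dumps(asdict(record), indent=2))   # exportable for mapping tools
print(record.unit_cost_usd_per_m())           # 28.0 USD/m for trend analyses
```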
Abstract: Current methods for predicting missing values in datasets often rely on simplistic approaches such as taking the median value of attributes, limiting their applicability. Real-world observations can be diverse: taking stock prices as an example, they range from prices post-IPO to values before a company's collapse, and certain data points may be missing due to stock suspension. In this paper, we propose a novel approach using Nonlinear Inductive Matrix Completion (NIMC) and Deep Inductive Matrix Completion (DIMC) to predict associations, and conduct experiments on financial data relating dates and stocks. Our method leverages various types of stock observations to capture latent factors explaining the observed date-stock associations. Notably, our approach is nonlinear, making it suitable for datasets with nonlinear structures, such as the Russell 3000. Unlike traditional methods that may suffer from information loss, NIMC and DIMC maintain nearly complete information, especially with high-dimensional parameters. We compared our approach with the state-of-the-art linear method, Inductive Matrix Completion. Our findings show that nonlinear matrix completion is particularly effective for handling nonlinearly structured data, as exemplified by the Russell 3000. Additionally, we validate the information loss of the three methods across different dimensionalities.
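The shared core of these methods is a low-rank factorization fitted only on observed entries; the inductive and nonlinear variants additionally use side features and a nonlinearity. A minimal linear sketch of that core (not the paper's NIMC/DIMC implementation):

```python
import numpy as np

def low_rank_complete(M, mask, rank=10, lr=0.01, reg=0.1, epochs=500, rng=None):
    """Fit M ~ U @ V.T on the observed entries (mask == True) by
    gradient descent on squared error with L2 regularization."""
    rng = np.random.default_rng(0) if rng is None else rng
    m, n = M.shape
    U = rng.normal(scale=0.1, size=(m, rank))
    V = rng.normal(scale=0.1, size=(n, rank))
    for _ in range(epochs):
        E = mask * (U @ V.T - M)          # error only on observed entries
        U -= lr * (E @ V + reg * U)
        V -= lr * (E.T @ U + reg * V)
    return U @ V.T

# Synthetic date-by-stock matrix with 60% of entries observed.
rng = np.random.default_rng(1)
truth = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 40))
mask = rng.random(truth.shape) < 0.6
pred = low_rank_complete(truth * mask, mask, rank=5)
print(np.abs((pred - truth)[~mask]).mean())   # held-out reconstruction error
```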
Funding: Supported by the National Natural Science Foundation of China-Science and Technology Development Fund (No. 62361166662), the National Key R&D Program of China (Nos. 2023YFC3503400 and 2022YFC3400400), the Key R&D Program of Hunan Province (Nos. 2023GK2004, 2023SK2059, and 2023SK2060), the Top 10 Technical Key Project in Hunan Province (No. 2023GK1010), the Key Technologies R&D Program of Guangdong Province (No. 2023B1111030004), the Funds of State Key Laboratory of Chemo/Biosensing and Chemometrics, the National Supercomputing Center in Changsha (http://nscc.hnu.edu.cn/), and Peng Cheng Lab.
Abstract: Medical big data with artificial intelligence are vital in advancing digital medicine. However, the opaque and non-standardised nature of most medical data extraction makes it prone to batch effects and has become a significant obstacle to reproducing previous works. This paper develops an easy-to-use time-series multimodal data extraction pipeline, Quick-MIMIC, for standardised data extraction from MIMIC datasets. Our method can fully integrate different data structures, including structured, semi-structured, and unstructured data, into a single time-series table. We also introduce two additional modules to Quick-MIMIC: a pipeline parallelisation method for reducing extraction time, and data analysis methods for presenting the characteristics of the extracted data intuitively. Extensive experimental results show that our pipeline can efficiently extract the needed data from the MIMIC dataset and convert it into the correct format for further analytic tasks.
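A minimal sketch of the central extraction step, aligning long-format clinical events onto a regular time-series grid; the column names mirror the general shape of MIMIC event tables and are assumptions, not Quick-MIMIC's actual interface:

```python
import pandas as pd

def to_timeseries_table(events: pd.DataFrame, freq: str = "1h") -> pd.DataFrame:
    """Pivot long-format events (one row per measurement) into a regular
    time-series table with one column per clinical variable.
    Assumed columns: stay_id, charttime, variable, value."""
    events = events.copy()
    events["charttime"] = pd.to_datetime(events["charttime"])
    grid = (events
            .set_index("charttime")
            .groupby(["stay_id", "variable"])["value"]
            .resample(freq).mean()            # aggregate within each time bin
            .unstack("variable"))             # one column per variable
    return grid.groupby(level="stay_id").ffill()  # carry forward within a stay

demo = pd.DataFrame({
    "stay_id": [1, 1, 1],
    "charttime": ["2130-01-01 00:10", "2130-01-01 00:40", "2130-01-01 02:05"],
    "variable": ["heart_rate", "heart_rate", "sbp"],
    "value": [88.0, 92.0, 120.0],
})
print(to_timeseries_table(demo))
```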
Abstract: Agentic AI represents a significant advancement in artificial intelligence, enabling proactive agents that can set goals, make decisions, and adapt to changing situations. However, the performance of these systems is heavily dependent on the quality and relevance of the data they process. This research highlights the critical risk posed by faulty, insecure, or contextually inappropriate input data in modern Agentic AI systems. To address this challenge, this study proposes the Autonomous Data Integrity Layer (ADIL), a flexible architecture that integrates best practices from security engineering and data science to ensure that Agentic AI systems operate on clean, validated, and contextually relevant data. By focusing on data integrity, ADIL enhances the reliability, accountability, and effectiveness of Agentic AI systems, leading to more trustworthy and robust intelligent agents.
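A toy sketch in the spirit of such an integrity layer; the specific checks (schema, range, provenance) are assumptions, since the abstract describes the architecture only at a high level:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class IntegrityCheck:
    name: str
    predicate: Callable[[dict], bool]

class DataIntegrityGate:
    """Every record must pass all registered checks before an agent
    is allowed to act on it; failures name the violated checks."""
    def __init__(self):
        self.checks: list[IntegrityCheck] = []

    def register(self, name, predicate):
        self.checks.append(IntegrityCheck(name, predicate))

    def validate(self, record: dict) -> tuple[bool, list[str]]:
        failures = [c.name for c in self.checks if not c.predicate(record)]
        return (not failures, failures)

gate = DataIntegrityGate()
gate.register("has_schema", lambda r: {"sensor_id", "value", "source"} <= r.keys())
gate.register("in_range", lambda r: 0.0 <= r.get("value", -1.0) <= 100.0)
gate.register("trusted_source", lambda r: r.get("source") in {"plant_a", "plant_b"})

ok, why = gate.validate({"sensor_id": "t1", "value": 250.0, "source": "unknown"})
print(ok, why)   # False ['in_range', 'trusted_source'] -> quarantine, don't act
```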
Abstract: Dear Editor, In the accompanying Comment, Bratchenko et al. raised two concerns about the spectral data analysis pipeline employed for the Surface-enhanced Raman scattering and Artificial Intelligence for Cancer Screening (SERS-AICS) technique in our original paper: (1) inappropriate accuracy presentation and (2) the use of a single data split for model evaluation. As a promising technique for molecular fingerprinting, SERS-based early cancer detection approaches using biofluids and liquid biopsy are typically evaluated based strictly on their accuracy and reliability.
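The second concern is a general methodological point: a single train/test split yields one accuracy number, while repeated random splits expose its variance. A minimal illustration on synthetic data (a stand-in, not the SERS-AICS dataset or pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score

# Synthetic stand-in for spectral features; the point is methodological only.
X, y = make_classification(n_samples=300, n_features=50, random_state=0)
model = LogisticRegression(max_iter=1000)

# 30 stratified random splits instead of one fixed split.
splitter = StratifiedShuffleSplit(n_splits=30, test_size=0.2, random_state=0)
scores = cross_val_score(model, X, y, cv=splitter, scoring="accuracy")
print(f"accuracy {scores.mean():.3f} +/- {scores.std():.3f} over 30 splits")
```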
Funding: Supported by UKRI Innovate UK (Project number: 44584, Project title: AI-driven and real-time command and control centre for site equipment in infrastructure projects, Funding body: UKRI Innovate UK, United Kingdom) under the 'Increase productivity, performance and quality in UK construction' competition.
Abstract: Current research on Digital Twin (DT) is largely focused on the performance of built assets in their operational phases and on the urban environment; the construction phase has received far less attention. This paper therefore proposes a Digital Twin framework for the construction phase, develops a DT prototype, and tests it on the use case of measuring productivity and monitoring earthwork operations. The DT framework and its prototype are underpinned by the principles of versatility, scalability, usability and automation, enabling the DT to meet the requirements of large earthwork projects and the dynamic nature of their operation. Cloud computing and dashboard visualisation were deployed to enable automated and repeatable data pipelines and data analytics at scale, and to provide insights in near-real time. Testing the DT prototype on a motorway project in the Northeast of England successfully demonstrated its ability to produce key insights: (i) predicting equipment utilisation ratios and productivities; (ii) detecting the percentage of time spent on different tasks (i.e., loading, hauling, dumping, returning or idling), the distance travelled by equipment over time, and the speed distribution; and (iii) visualising certain earthwork operations.
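A minimal sketch of how insights (i) and (ii) can be computed from an equipment telemetry log; the column names and task labels are illustrative assumptions, not the project's actual data schema:

```python
import pandas as pd

def task_breakdown(log: pd.DataFrame) -> pd.Series:
    """Fraction of time each machine spends per task, from a telemetry log
    with assumed columns: equipment_id, task, start, end."""
    log = log.copy()
    log["duration"] = pd.to_datetime(log["end"]) - pd.to_datetime(log["start"])
    per_task = log.groupby(["equipment_id", "task"])["duration"].sum()
    return per_task / per_task.groupby(level="equipment_id").transform("sum")

log = pd.DataFrame({
    "equipment_id": ["EX01"] * 3,
    "task": ["loading", "idling", "loading"],
    "start": ["2024-05-01 08:00", "2024-05-01 08:20", "2024-05-01 08:35"],
    "end":   ["2024-05-01 08:20", "2024-05-01 08:35", "2024-05-01 09:00"],
})
breakdown = task_breakdown(log)
print(breakdown)                                  # share of time per task
print(1.0 - breakdown.loc[("EX01", "idling")])    # utilisation ratio
```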