Accurate prediction of flood events is important for flood control and risk management. Machine learning techniques have contributed greatly to advances in flood prediction, and existing studies have mainly focused on predicting flood resource variables using single or hybrid machine learning techniques. However, class-based flood predictions, which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies, have rarely been investigated. This study proposed an approach for predicting flood regime metrics and event classes that couples machine learning algorithms with clustering-deduced membership degrees. Five algorithms were adopted for this exploration. Results showed that the class membership degrees accurately determined event classes, with class hit rates up to 100% compared with the four classes clustered from nine regime metrics. The nonlinear algorithms (Multiple Nonlinear Regression, Random Forest, and least squares-Support Vector Machine) outperformed the linear techniques (Multiple Linear Regression and Stepwise Regression) in predicting flood regime metrics. The proposed approach predicted flood event classes well, with average class hit rates of 66.0%-85.4% and 47.2%-76.0% in the calibration and validation periods, respectively, particularly for slow and late flood events. The predictive capability of the proposed approach for flood regime metrics and classes was considerably stronger than that of the hydrological modeling approach.
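
The flood abstract above reports class hit rates derived from membership degrees but does not spell out the computation. The sketch below is one plausible reading, not the authors' implementation: it assumes each event is assigned to the class with the largest membership degree and that the hit rate is the fraction of events whose assigned class matches the clustered class. The function names and the sample numbers are hypothetical.

```python
def assign_class(memberships):
    """Return the index of the class with the largest membership degree."""
    return max(range(len(memberships)), key=lambda k: memberships[k])


def class_hit_rate(predicted, reference):
    """Fraction of events whose predicted class equals the reference class."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Hypothetical membership degrees of four events in four classes (illustrative only).
membership_degrees = [
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.10, 0.20, 0.30, 0.40],
    [0.40, 0.30, 0.20, 0.10],
]
# Hypothetical classes obtained from clustering the regime metrics.
reference_classes = [0, 1, 3, 2]

predicted_classes = [assign_class(m) for m in membership_degrees]
hit_rate = class_hit_rate(predicted_classes, reference_classes)
```

In this toy case three of the four events are assigned to their clustered class, so the hit rate is 0.75.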
Open caissons are widely used in foundation engineering because of their load-bearing efficiency and adaptability in diverse soil conditions. However, accurately predicting their undrained bearing capacity in layered soils remains a complex challenge. This study presents a novel application of five ensemble machine learning (ML) algorithms, namely random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), to predict the undrained bearing capacity factor (Nc) of circular open caissons embedded in two-layered clay on the basis of results from finite element limit analysis (FELA). The input dataset consists of 1188 numerical simulations using the Tresca failure criterion, varying in geometrical and soil parameters. The FELA was performed via OptumG2 software with adaptive meshing techniques and verified against existing benchmark studies. The ML models were trained on 70% of the dataset and tested on the remaining 30%. Their performance was evaluated using six statistical metrics: coefficient of determination (R²), mean absolute error (MAE), root mean squared error (RMSE), index of scatter (IOS), RMSE-to-standard deviation ratio (RSR), and variance explained factor (VAF). The results indicate that all the models achieved high accuracy, with R² values exceeding 97.6% and RMSE values below 0.02. Among them, AdaBoost and CatBoost consistently outperformed the other methods across both the training and testing datasets, demonstrating superior generalizability and robustness. The proposed ML framework offers an efficient, accurate, and data-driven alternative to traditional methods for estimating caisson capacity in stratified soils. This approach can aid in reducing computational costs while improving reliability in the early stages of foundation design.
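
The six metrics named in the caisson abstract above can be sketched in plain Python. The abstract does not give formulas, so the definitions here are commonly used ones and are assumptions: R² as 1 - SSE/SST, IOS as RMSE divided by the mean of the observations, RSR as RMSE divided by the population standard deviation of the observations, and VAF as (1 - var(error)/var(observed)) × 100. The function name and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Compute R2, MAE, RMSE, IOS, RSR and VAF under the assumed definitions."""
    n = len(observed)
    if n < 2 or n != len(predicted):
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_obs = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                   # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    rmse = math.sqrt(sse / n)
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    var_obs = sst / n
    return {
        "R2": 1 - sse / sst,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "IOS": rmse / mean_obs,
        "RSR": rmse / math.sqrt(var_obs),
        "VAF": (1 - var_err / var_obs) * 100,
    }


# Hypothetical observed and predicted values (illustrative only).
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 6.0, 7.0])
```

Some studies report R² as the squared correlation coefficient instead; the two definitions agree only when predictions are unbiased and uncorrelated with the residuals, so the choice should be checked against the paper before comparing numbers.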
Edge Machine Learning (EdgeML) and Tiny Machine Learning (TinyML) are fast-growing fields that bring machine learning to resource-constrained devices, allowing real-time data processing and decision-making at the network’s edge. However, the complexity of model conversion techniques, diverse inference mechanisms, and varied learning strategies make designing and deploying these models challenging. Additionally, deploying TinyML models on resource-constrained hardware with specific software frameworks has broadened EdgeML’s applications across various sectors. These factors underscore the necessity for a comprehensive literature review, as current reviews do not systematically encompass the most recent findings on these topics. Consequently, this article provides a comprehensive overview of state-of-the-art techniques in model conversion, inference mechanisms, and learning strategies within EdgeML, and in deploying these models on resource-constrained edge devices using TinyML. It identifies 90 research articles published between 2018 and 2025, categorizing them into two main areas: (1) model conversion, inference, and learning strategies in EdgeML and (2) deploying TinyML models on resource-constrained hardware using specific software frameworks. In the first category, the synthesis of selected research articles compares and critically reviews various model conversion techniques, inference mechanisms, and learning strategies. In the second category, the synthesis identifies and elaborates on major development boards, software frameworks, sensors, and algorithms used in various applications across six major sectors. As a result, this article provides valuable insights for researchers, practitioners, and developers. It assists them in choosing suitable model conversion techniques, inference mechanisms, learning strategies, hardware development boards, software frameworks, sensors, and algorithms tailored to their specific needs and applications across various sectors.
Sentiment Analysis, a significant domain within Natural Language Processing (NLP), focuses on extracting and interpreting subjective information, such as emotions, opinions, and attitudes, from textual data. With the increasing volume of user-generated content on social media and digital platforms, sentiment analysis has become essential for deriving actionable insights across various sectors. This study presents a systematic literature review of sentiment analysis methodologies, encompassing traditional machine learning algorithms, lexicon-based approaches, and recent advancements in deep learning techniques. The review follows a structured protocol comprising three phases: planning, execution, and analysis/reporting. During the execution phase, 67 peer-reviewed articles were initially retrieved, with 25 meeting predefined inclusion and exclusion criteria. The analysis phase involved a detailed examination of each study’s methodology, experimental setup, and key contributions. Among the deep learning models evaluated, Long Short-Term Memory (LSTM) networks were identified as the most frequently adopted architecture for sentiment classification tasks. This review highlights current trends, technical challenges, and emerging opportunities in the field, providing valuable guidance for future research and development in applications such as market analysis, public health monitoring, financial forecasting, and crisis management.
Non-technical losses (NTL) of electric power are a serious problem for electric distribution companies. The solution determines the cost, stability, reliability, and quality of the supplied electricity. The widespread use of advanced metering infrastructure (AMI) and Smart Grid allows all participants in the distribution grid to store and track electricity consumption. In this research, a machine learning model is developed that analyzes and predicts the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings. This model is an ensemble meta-algorithm (stacking) that generalizes the algorithms of random forest, LightGBM, and a homogeneous ensemble of artificial neural networks. The superior accuracy of the proposed meta-algorithm in comparison with the base classifiers is experimentally confirmed on the test sample. Such a model, owing to its good accuracy (ROC-AUC of 0.88), can be used as a methodological basis for a decision support system whose purpose is to form a sample of suspected NTL sources. The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of raids by performers, making them targeted and accurate, which should contribute to the fight against NTL and the sustainable development of the electric power industry.
Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. Artificial neural network (ANN) algorithms have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to representative studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, namely Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results show that all three algorithms classified five basic types of land cover (Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas) well, as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, with an overall accuracy of 86% and a Kappa coefficient of 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
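
The land cover abstract above reports overall accuracy and the Kappa coefficient. A minimal sketch of how both are conventionally derived from a confusion matrix follows; it uses the standard Cohen's kappa formula, while the function name and the two-class matrix are hypothetical and unrelated to the study's data.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i that were
    classified as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (illustrative only).
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Here the overall accuracy is 0.85 and the chance agreement is 0.5, which gives a kappa of 0.7. Kappa discounts agreement that would occur by chance, so for the same matrix it does not exceed the overall accuracy.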
Deep neural networks are increasingly exposed to attack threats, and at the same time, the need for privacy protection is growing. As a result, the challenge of developing neural networks that are both robust and capable of strong generalization while maintaining privacy becomes pressing. Training neural networks under privacy constraints is one way to minimize privacy leakage, and one way to do this is to add noise to the data or model. However, noise may cause gradient directions to deviate from the optimal trajectory during training, leading to unstable parameter updates, slow convergence, and reduced model generalization capability. To overcome these challenges, we propose an optimization algorithm based on double-integral coevolutionary neurodynamics (DICND), designed to accelerate convergence and improve generalization in noisy conditions. Theoretical analysis proves the global convergence of the DICND algorithm and demonstrates its ability to converge to near-global minima efficiently under noisy conditions. Numerical simulations and image classification experiments further confirm the DICND algorithm's significant advantages in enhancing generalization performance.
The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, how to address the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The research results show that the methanol:palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is more suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol:PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.
BACKGROUND: This study aims to develop and validate a machine learning-based in-hospital mortality predictive model for acute aortic syndrome (AAS) in the emergency department (ED) and to derive a simplified version suitable for rapid clinical application. METHODS: In this multi-center retrospective cohort study, AAS patient data from three hospitals were analyzed. The modeling cohort included data from the First Affiliated Hospital of Zhengzhou University and the People’s Hospital of Xinjiang Uygur Autonomous Region, with Peking University Third Hospital data serving as the external test set. Four machine learning algorithms, namely logistic regression (LR), multilayer perceptron (MLP), Gaussian naive Bayes (GNB), and random forest (RF), were used to develop predictive models based on 34 early-accessible clinical variables. A simplified model was then derived based on five key variables (Stanford type, pericardial effusion, asymmetric peripheral arterial pulsation, decreased bowel sounds, and dyspnea) via Least Absolute Shrinkage and Selection Operator (LASSO) regression to improve ED applicability. RESULTS: A total of 929 patients were included in the modeling cohort, and 210 were included in the external test set. Four machine learning models based on 34 clinical variables were developed, achieving internal and external validation AUCs of 0.85-0.90 and 0.73-0.85, respectively. The simplified model incorporating five key variables demonstrated internal and external validation AUCs of 0.71-0.86 and 0.75-0.78, respectively. Both models showed robust calibration and predictive stability across datasets. CONCLUSION: Both kinds of models were built with machine learning tools and demonstrated a certain degree of predictive performance and extrapolation ability.
Efficient surface passivation is critical for achieving high-performance perovskite solar cells (PSCs), yet the discovery of optimal passivators remains a time-consuming, trial-and-error process. Here, we report a synergistic machine learning (ML) and density functional theory (DFT) approach that enables predictive and rapid identification of effective passivation materials. By training an XGBoost model (91.3% accuracy) with DFT-derived molecular descriptors and activity calculations, we identify 2-(4-aminophenyl)-3H-benzimidazol-5-amine (APBIA) as a promising passivator. Experimental validation demonstrates that APBIA effectively removes surface impurities and passivates defects within perovskite films, leading to a significant increase in power conversion efficiency (PCE) from 22.48% to 25.55% (certified as 25.02%). This ML-DFT framework provides a generalizable pathway for accelerating the development of advanced functional materials for photovoltaic applications.
Lithology identification while drilling technology can obtain rock information in real time. However, traditional lithology identification models often face limitations in feature extraction and adaptability to complex geological conditions, limiting their accuracy in challenging environments. To address these challenges, a deep learning model for lithology identification while drilling is proposed. The proposed model introduces a dual attention mechanism in the long short-term memory (LSTM) network, effectively enhancing the ability to capture spatial and channel dimension information. Subsequently, the crayfish optimization algorithm (COA) is applied to optimize the model network structure, thereby enhancing its lithology identification capability. Laboratory test results demonstrate that the proposed model achieves 97.15% accuracy on the testing set, significantly outperforming the traditional support vector machine (SVM) method (81.77%). Field tests under actual drilling conditions demonstrate an average accuracy of 91.96% for the proposed model, representing a 14.31% improvement over the LSTM model alone. The proposed model demonstrates robust adaptability and generalization ability across diverse operational scenarios. This research offers reliable technical support for lithology identification while drilling.
Recently, the Internet of Things (IoT) has been increasingly integrated into the automotive sector, enabling the development of diverse applications such as the Internet of Vehicles (IoV) and intelligent connected vehicles. Leveraging IoV technologies, operational data from core vehicle components can be collected and analyzed to construct fault diagnosis models, thereby enhancing vehicle safety. However, automakers often struggle to acquire sufficient fault data to support effective model training. To address this challenge, a robust and efficient federated learning method (REFL) is constructed for machinery fault diagnosis in collaborative IoV, which can organize multiple companies to collaboratively develop a comprehensive fault diagnosis model while keeping their data local. In the REFL, a gradient-based adversary algorithm is first introduced to the fault diagnosis field to enhance the robustness of the deep learning model. Moreover, an adaptive gradient processing procedure is designed to improve the model training speed and ensure model accuracy under unbalanced data scenarios. The proposed REFL is evaluated on a non-independent and identically distributed (non-IID) real-world machinery fault dataset. Experimental results demonstrate that the REFL achieves better performance than traditional learning methods and is promising for real industrial fault diagnosis.
To curb the worsening tropospheric ozone (O₃) pollution problem in China, rapid and accurate identification of O₃-precursor sensitivity (OPS) is a crucial prerequisite for formulating effective contingency O₃ pollution control strategies. However, currently widely used methods, such as statistical models and numerical models, exhibit inherent limitations in identifying OPS in a timely and accurate manner. In this study, we developed a novel approach to identify OPS based on the eXtreme Gradient Boosting model, the Shapley additive explanation (SHAP) algorithm, and volatile organic compound (VOC) photochemical decay adjustment, using meteorology and speciated pollutant monitoring data as the input. By comparing the difference in SHAP values between the base scenario and the precursor reduction scenario for nitrogen oxides (NOₓ) and VOCs, OPS was divided into NOₓ-limited, VOCs-limited, and transition regimes. Using the long-lasting O₃ pollution episode in the autumn of 2022 in the Guangdong-Hong Kong-Macao Greater Bay Area (GBA) as an example, we demonstrated large spatiotemporal heterogeneities of OPS over the GBA, which generally shifted from NOₓ-limited to VOCs-limited from September to October and was more inclined to be VOCs-limited in the central areas and NOₓ-limited in the peripheral areas. This study developed an innovative OPS identification method by comparing the difference in SHAP values before and after precursor emission reduction. Our method enables the accurate identification of OPS on a time scale of seconds, thereby providing a state-of-the-art tool for the rapid guidance of spatially specific O₃ control strategies.
The current study aimed to evaluate the capabilities of seven advanced machine learning techniques (MLTs), namely Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and to compare their performance. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups with a ratio of 70% for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling: altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), land use/land cover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under the curve (AUC-ROC) approach was applied to evaluate, validate, and compare the performance of the MLTs. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that the RF (AUC=95.1%) and LDA (AUC=941.7%) produced the best performance in comparison with the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
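
The landslide abstract above relies on a random 70%/30% split and on the area under the ROC curve. The following sketch illustrates both in plain Python; it is not the study's R workflow. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half). The function names, the seed, and the sample scores are hypothetical.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle items reproducibly and return (training, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# 243 location identifiers, matching the inventory size stated in the abstract.
training, validation = split_70_30(range(243))

# Hypothetical labels (1 = landslide) and susceptibility scores (illustrative only).
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.2])
```

With 243 locations the split yields 170 training and 73 validation samples. The pairwise loop is quadratic in the sample count, which is acceptable for an inventory of this size; a sort-based rank sum would scale better for large datasets.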
Compression index Cc is an essential parameter in geotechnical design for which the effectiveness of correlation is still a challenge. This paper suggests a novel modelling approach using machine learning (ML) techniques. The performance of five commonly used ML algorithms, i.e. back-propagation neural network (BPNN), extreme learning machine (ELM), support vector machine (SVM), random forest (RF), and evolutionary polynomial regression (EPR), in predicting Cc is comprehensively investigated. A database with a total of 311 datasets, including three input variables (initial void ratio e0, liquid limit water content wL, and plasticity index Ip) and one output variable Cc, is first established. A genetic algorithm (GA) is used to optimize the hyper-parameters of the five ML algorithms, and the average prediction error over the 10-fold cross-validation (CV) sets is set as the fitness function in the GA to enhance the robustness of the ML models. The results indicate that the ML models outperform empirical prediction formulations, with lower prediction error. RF yields the lowest error, followed by BPNN, ELM, EPR, and SVM. If the ranges of input variables in the database are large enough, the BPNN and RF models are recommended to predict Cc. Furthermore, if the distribution of input variables is continuous, the RF model is the best one. Otherwise, the EPR model is recommended if the ranges of input variables are small. The predicted correlations between input and output variables using the five ML models show great agreement with the physical explanation.
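
The compression index abstract above uses the average prediction error over 10-fold cross-validation sets as the GA fitness function. A minimal sketch of such a fitness evaluation is given below. It covers only the cross-validation part; the GA itself and the five ML models are omitted. The mean absolute error, the contiguous (unshuffled) folds, the trivial mean-predictor `fit_mean`, and all names are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_fitness(xs, ys, fit, k=10):
    """Mean absolute validation error averaged over k folds (lower is better)."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)   # fit returns a callable predictor
        error = sum(abs(model(xs[i]) - ys[i]) for i in held_out) / len(held_out)
        fold_errors.append(error)
    return sum(fold_errors) / k


def fit_mean(train_x, train_y):
    """Hypothetical stand-in model: always predicts the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


fold_sizes = [len(fold) for fold in k_fold_indices(311)]   # 311 datasets, as in the abstract
fitness = cv_fitness(list(range(20)), [5.0] * 20, fit_mean)
```

In a GA, each candidate set of hyper-parameters would be passed into `fit`, and the returned fitness would rank the candidates. Shuffling the samples before forming folds is usually advisable when the database is ordered.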
Funding (flood regime metrics and event class prediction study): National Key Research and Development Program of China, No. 2023YFC3006704; National Natural Science Foundation of China, No. 42171047; CAS-CSIRO Partnership Joint Project of 2024, No. 177GJHZ2023097MI.
Funding: Supported by the “Technology Commercialization Collaboration Platform Construction” project of the Innopolis Foundation (Project Number: 2710033536) and the Competitive Research Fund of The University of Aizu, Japan.
Abstract: Sentiment Analysis, a significant domain within Natural Language Processing (NLP), focuses on extracting and interpreting subjective information, such as emotions, opinions, and attitudes, from textual data. With the increasing volume of user-generated content on social media and digital platforms, sentiment analysis has become essential for deriving actionable insights across various sectors. This study presents a systematic literature review of sentiment analysis methodologies, encompassing traditional machine learning algorithms, lexicon-based approaches, and recent advancements in deep learning techniques. The review follows a structured protocol comprising three phases: planning, execution, and analysis/reporting. During the execution phase, 67 peer-reviewed articles were initially retrieved, with 25 meeting predefined inclusion and exclusion criteria. The analysis phase involved a detailed examination of each study’s methodology, experimental setup, and key contributions. Among the deep learning models evaluated, Long Short-Term Memory (LSTM) networks were identified as the most frequently adopted architecture for sentiment classification tasks. This review highlights current trends, technical challenges, and emerging opportunities in the field, providing valuable guidance for future research and development in applications such as market analysis, public health monitoring, financial forecasting, and crisis management.
Abstract: Non-technical losses (NTL) of electric power are a serious problem for electric distribution companies. How well this problem is solved determines the cost, stability, reliability, and quality of the supplied electricity. The widespread use of advanced metering infrastructure (AMI) and Smart Grid allows all participants in the distribution grid to store and track electricity consumption. In this research, a machine learning model is developed that analyzes and predicts the probability of NTL for each consumer of the distribution grid based on daily electricity consumption readings. This model is an ensemble meta-algorithm (stacking) that generalizes the algorithms of random forest, LightGBM, and a homogeneous ensemble of artificial neural networks. The superior accuracy of the proposed meta-algorithm in comparison to the base classifiers is experimentally confirmed on the test sample. Owing to its good accuracy (ROC-AUC of 0.88), such a model can be used as a methodological basis for a decision support system whose purpose is to form a sample of suspected NTL sources. The use of such a sample will allow the top management of electric distribution companies to increase the efficiency of raids by field staff, making them targeted and accurate, which should contribute to the fight against NTL and the sustainable development of the electric power industry.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52073031), the National Key Research and Development Program of China (Nos. 2023YFB3208102, 2021YFB3200304), the China National Postdoctoral Program for Innovative Talents (No. BX2021302), the Beijing Nova Program (Nos. Z191100001119047, Z211100002121148), the Fundamental Research Funds for the Central Universities (No. E0EG6801X2), the ‘Hundred Talents Program’ of the Chinese Academy of Sciences, and the BrainLink program funded by the MSIT through the NRF of Korea (No. RS-2023-00237308).
Abstract: Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. Artificial neural network (ANN) algorithms have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to representative studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Abstract: Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results show that all three algorithms classified five basic types of land cover well (Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas), as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, with an overall accuracy of 86% and a Kappa coefficient of 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
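Overall accuracy and the Kappa coefficient are both derived from a classification confusion matrix, and a short sketch may make the relationship concrete. The function name `overall_accuracy_and_kappa` and the two-class matrix below are illustrative assumptions, not data from the study; the formula is the standard Cohen's kappa.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that were assigned to class j by the classifier.
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    observed = correct / total  # overall accuracy (observed agreement)

    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class matrix: overall accuracy 0.85, kappa about 0.70.
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Under this definition kappa discounts chance agreement, so it never exceeds the overall accuracy computed from the same matrix.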
Funding: Supported by the National Natural Science Foundation of China (62394340, 62394345, 62473383). This work was carried out in part using computing resources at the High Performance Computing Center of Central South University.
Abstract: Deep neural networks are increasingly exposed to attack threats, and at the same time, the need for privacy protection is growing. As a result, the challenge of developing neural networks that are both robust and capable of strong generalization while maintaining privacy becomes pressing. Training neural networks under privacy constraints is one way to minimize privacy leakage, and one way to do this is to add noise to the data or model. However, noise may cause gradient directions to deviate from the optimal trajectory during training, leading to unstable parameter updates, slow convergence, and reduced model generalization capability. To overcome these challenges, we propose an optimization algorithm based on double-integral coevolutionary neurodynamics (DICND), designed to accelerate convergence and improve generalization in noisy conditions. Theoretical analysis proves the global convergence of the DICND algorithm and demonstrates its ability to converge to near-global minima efficiently under noisy conditions. Numerical simulations and image classification experiments further confirm the DICND algorithm's significant advantages in enhancing generalization performance.
Funding: Supported by the National Natural Science Foundation of China (22408227, 22238005) and the Postdoctoral Research Foundation of China (GZC20231576).
Abstract: The optimization of reaction processes is crucial for the green, efficient, and sustainable development of the chemical industry. However, addressing the problems posed by multiple variables, nonlinearities, and uncertainties during optimization remains a formidable challenge. In this study, a strategy combining interpretable machine learning with metaheuristic optimization algorithms is employed to optimize the reaction process. First, experimental data from a biodiesel production process are collected to establish a database. These data are then used to construct a predictive model based on artificial neural network (ANN) models. Subsequently, interpretable machine learning techniques are applied for quantitative analysis and verification of the model. Finally, four metaheuristic optimization algorithms are coupled with the ANN model to achieve the desired optimization. The results show that the methanol:palm fatty acid distillate (PFAD) molar ratio contributes the most to the reaction outcome, accounting for 41%. The ANN-simulated annealing (SA) hybrid method is the most suitable for this optimization, and the optimal process parameters are a catalyst concentration of 3.00% (mass), a methanol:PFAD molar ratio of 8.67, and a reaction time of 30 min. This study provides deeper insights into reaction process optimization, which will facilitate future applications in various reaction optimization processes.
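The coupling of a surrogate model with simulated annealing described in this abstract can be sketched as follows. This is a generic textbook form of simulated annealing under stated assumptions, not the study's implementation: `toy_surrogate` is a hypothetical quadratic stand-in for the trained ANN, and its peak location, the variable bounds, the cooling schedule, and the step size are all invented for illustration.

```python
import math
import random


def simulated_annealing(objective, bounds, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Maximise `objective` over box-bounded variables with simulated annealing."""
    rng = random.Random(seed)
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_val = objective(current)
    best, best_val = list(current), current_val

    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        # Gaussian perturbation (10% of each range), clipped to the bounds.
        candidate = [
            min(hi, max(lo, x + rng.gauss(0.0, 0.1 * (hi - lo))))
            for x, (lo, hi) in zip(current, bounds)
        ]
        cand_val = objective(candidate)
        delta = cand_val - current_val
        # Always accept improvements; accept worse moves with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_val = candidate, cand_val
            if current_val > best_val:
                best, best_val = list(current), current_val
    return best, best_val


def toy_surrogate(x):
    """Hypothetical stand-in for a trained ANN that predicts a yield-like score.

    The peak at (2.0, 6.0, 45.0) is arbitrary and unrelated to the study's results.
    """
    catalyst, molar_ratio, minutes = x
    return (100.0 - 4.0 * (catalyst - 2.0) ** 2
            - 0.5 * (molar_ratio - 6.0) ** 2
            - 0.01 * (minutes - 45.0) ** 2)


# Assumed search ranges for the three process variables.
BOUNDS = [(1.0, 3.0), (4.0, 10.0), (30.0, 90.0)]
best_x, best_score = simulated_annealing(toy_surrogate, BOUNDS)
```

In a real workflow the surrogate call would be replaced by the trained model's prediction function, which keeps the optimizer independent of how the model was built.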
Funding: Supported by the special fund of the National Clinical Key Specialty Construction Program [(2022)301-2305].
Abstract: BACKGROUND: This study aims to develop and validate a machine learning-based in-hospital mortality predictive model for acute aortic syndrome (AAS) in the emergency department (ED) and to derive a simplified version suitable for rapid clinical application. METHODS: In this multi-center retrospective cohort study, AAS patient data from three hospitals were analyzed. The modeling cohort included data from the First Affiliated Hospital of Zhengzhou University and the People’s Hospital of Xinjiang Uygur Autonomous Region, with Peking University Third Hospital data serving as the external test set. Four machine learning algorithms, logistic regression (LR), multilayer perceptron (MLP), Gaussian naive Bayes (GNB), and random forest (RF), were used to develop predictive models based on 34 early-accessible clinical variables. A simplified model was then derived based on five key variables (Stanford type, pericardial effusion, asymmetric peripheral arterial pulsation, decreased bowel sounds, and dyspnea) via Least Absolute Shrinkage and Selection Operator (LASSO) regression to improve ED applicability. RESULTS: A total of 929 patients were included in the modeling cohort, and 210 were included in the external test set. Four machine learning models based on 34 clinical variables were developed, achieving internal and external validation AUCs of 0.85-0.90 and 0.73-0.85, respectively. The simplified model incorporating five key variables demonstrated internal and external validation AUCs of 0.71-0.86 and 0.75-0.78, respectively. Both models showed robust calibration and predictive stability across datasets. CONCLUSION: Both kinds of models were built with machine learning tools and demonstrated a degree of predictive performance and generalizability to external data.
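The AUC values reported here summarize how well predicted risk scores rank patients who died above those who survived. A minimal sketch of that interpretation follows, using the pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the four-sample example are illustrative and not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equals the probability that a randomly chosen positive case receives a higher
    score than a randomly chosen negative case, with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = event) and predicted risks; three of four pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The quadratic pairwise loop is chosen for readability; for large cohorts a rank-based computation gives the same value more efficiently.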
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2024YFB4205101), the National Natural Science Foundation of China (No. 62274098 and No. 62074084), the Natural Science Foundation of Tianjin (No. 22JCYBJC01300, No. 23JCYBJC01620 and No. 21JCYBJC00270), the Overseas Expertise Introduction Project for Discipline Innovation of Higher Education of China (Grant No. B16027), and the Fundamental Research Funds for the Central Universities, Nankai University (No. 63241568).
Abstract: Efficient surface passivation is critical for achieving high-performance perovskite solar cells (PSCs), yet the discovery of optimal passivators remains a time-consuming, trial-and-error process. Here, we report a synergistic machine learning (ML) and density functional theory (DFT) approach that enables predictive and rapid identification of effective passivation materials. By training an XGBoost model (91.3% accuracy) with DFT-derived molecular descriptors and activity calculations, we identify 2-(4-aminophenyl)-3H-benzimidazol-5-amine (APBIA) as a promising passivator. Experimental validation demonstrates that APBIA effectively removes surface impurities and passivates defects within perovskite films, leading to a significant increase in power conversion efficiency (PCE) from 22.48% to 25.55% (certified as 25.02%). This ML-DFT framework provides a generalizable pathway for accelerating the development of advanced functional materials for photovoltaic applications.
Funding: Supported by the National Key Research and Development Program for Young Scientists, China (Grant No. 2021YFC2900400), the Sichuan-Chongqing Science and Technology Innovation Cooperation Program Project, China (Grant No. 2024TIAD-CYKJCXX0269), and the National Natural Science Foundation of China, China (Grant No. 52304123).
Abstract: Lithology identification while drilling technology can obtain rock information in real time. However, traditional lithology identification models often face limitations in feature extraction and adaptability to complex geological conditions, limiting their accuracy in challenging environments. To address these challenges, a deep learning model for lithology identification while drilling is proposed. The proposed model introduces a dual attention mechanism in the long short-term memory (LSTM) network, effectively enhancing the ability to capture spatial and channel dimension information. Subsequently, the crayfish optimization algorithm (COA) is applied to optimize the model network structure, thereby enhancing its lithology identification capability. Laboratory test results demonstrate that the proposed model achieves 97.15% accuracy on the testing set, significantly outperforming the traditional support vector machine (SVM) method (81.77%). Field tests under actual drilling conditions demonstrate an average accuracy of 91.96% for the proposed model, representing a 14.31% improvement over the LSTM model alone. The proposed model demonstrates robust adaptability and generalization ability across diverse operational scenarios. This research offers reliable technical support for lithology identification while drilling.
Funding: Supported in part by the National Key R&D Projects (2024YFB4207203), the National Natural Science Foundation of China (52401376), the Zhejiang Provincial Natural Science Foundation of China under Grant No. LTGG24F030004, the Hangzhou Key Scientific Research Plan Project (2024SZD1A24), the “Pioneer” and “Leading Goose” R&D Program of Zhejiang (2024C03254, 2023C03154), and the Jiangxi Provincial Gan-Po Elite Support Program (Major Academic and Technical Leaders Cultivation Project, 20243BCE51180).
Abstract: Recently, the Internet of Things (IoT) has been increasingly integrated into the automotive sector, enabling the development of diverse applications such as the Internet of Vehicles (IoV) and intelligent connected vehicles. Leveraging IoV technologies, operational data from core vehicle components can be collected and analyzed to construct fault diagnosis models, thereby enhancing vehicle safety. However, automakers often struggle to acquire sufficient fault data to support effective model training. To address this challenge, a robust and efficient federated learning method (REFL) is constructed for machinery fault diagnosis in collaborative IoV, which can organize multiple companies to collaboratively develop a comprehensive fault diagnosis model while keeping their data local. In the REFL, a gradient-based adversary algorithm is first introduced to the fault diagnosis field to enhance the robustness of the deep learning model. Moreover, an adaptive gradient processing procedure is designed to improve the model training speed and ensure the model accuracy under imbalanced data scenarios. The proposed REFL is evaluated on a non-independent and identically distributed (non-IID) real-world machinery fault dataset. Experimental results demonstrate that the REFL can achieve better performance than traditional learning methods and is promising for real industrial fault diagnosis.
Funding: Supported by the Key-Area Research and Development Program of Guangdong Province (No. 2020B1111360003), the National Natural Science Foundation of China (Nos. 42465008 and 42105164), the Yunnan Science and Technology Department Project (No. 202501AT070239), the Yunnan Science and Technology Department Youth Project (No. 202401AU070202), and the Xianyang Rapid Response Decision Support Project for Ozone (No. YZ2024-ZB019).
Abstract: To curb the worsening tropospheric ozone (O₃) pollution problem in China, a rapid and accurate identification of O₃-precursor sensitivity (OPS) is a crucial prerequisite for formulating effective contingency O₃ pollution control strategies. However, currently widely used methods, such as statistical models and numerical models, exhibit inherent limitations in identifying OPS in a timely and accurate manner. In this study, we developed a novel approach to identify OPS based on the eXtreme Gradient Boosting model, the Shapley additive explanation (SHAP) algorithm, and volatile organic compound (VOC) photochemical decay adjustment, using meteorology and speciated pollutant monitoring data as the input. By comparing the difference in SHAP values between the base scenario and the precursor reduction scenario for nitrogen oxides (NOₓ) and VOCs, OPS was divided into NOₓ-limited, VOCs-limited, and transition regimes. Using the long-lasting O₃ pollution episode in the autumn of 2022 in the Guangdong-Hong Kong-Macao Greater Bay Area (GBA) as an example, we demonstrated large spatiotemporal heterogeneities of OPS over the GBA, which generally shifted from NOₓ-limited to VOCs-limited from September to October and was more inclined to be VOCs-limited in the central areas and NOₓ-limited in the peripheral areas. This study developed an innovative OPS identification method by comparing the difference in SHAP values before and after precursor emission reduction. Our method enables the accurate identification of OPS on a time scale of seconds, thereby providing a state-of-the-art tool for the rapid guidance of spatially specific O₃ control strategies.
Abstract: The current study aimed to evaluate and compare the capabilities of seven advanced machine learning techniques (MLTs), including Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups, with 70% used for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling: altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under the curve (AUC-ROC) approach was applied to evaluate, validate, and compare the MLTs' performance. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that RF (AUC = 95.1%) and LDA (AUC = 94.17%) produced the best performances in comparison to the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
Funding: The financial support provided by the RIF project (Grant No. PolyU R5037-18F) from the Research Grants Council (RGC) of Hong Kong is gratefully acknowledged.
Abstract: Compression index Cc is an essential parameter in geotechnical design for which the effectiveness of correlation is still a challenge. This paper suggests a novel modelling approach using machine learning (ML) techniques. The performance of five commonly used ML algorithms, i.e. back-propagation neural network (BPNN), extreme learning machine (ELM), support vector machine (SVM), random forest (RF) and evolutionary polynomial regression (EPR), in predicting Cc is comprehensively investigated. A database with a total of 311 datasets, including three input variables, i.e. initial void ratio e0, liquid limit water content wL, and plasticity index Ip, and one output variable Cc, is first established. A genetic algorithm (GA) is used to optimize the hyper-parameters of the five ML algorithms, and the average prediction error over the 10-fold cross-validation (CV) sets is set as the fitness function in the GA to enhance the robustness of the ML models. The results indicate that the ML models outperform empirical prediction formulations, with lower prediction error. RF yields the lowest error, followed by BPNN, ELM, EPR and SVM. If the ranges of input variables in the database are large enough, BPNN and RF models are recommended to predict Cc. Furthermore, if the distribution of input variables is continuous, the RF model is the best one. Otherwise, the EPR model is recommended if the ranges of input variables are small. The predicted correlations between input and output variables using the five ML models show great agreement with the physical explanation.
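Using the average 10-fold cross-validation error as a fitness function, as this abstract describes, can be sketched as below. This is a minimal illustration under loud assumptions: the learner is a one-input k-nearest-neighbour regressor chosen only for brevity (it is not one of the study's five algorithms), the data are synthetic, the genetic algorithm itself is not implemented, and the names `k_fold_indices`, `knn_predict`, and `cv_fitness` are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train, x, n_neighbors):
    """Predict y at x as the mean target of the n_neighbors closest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:n_neighbors]
    return sum(y for _, y in nearest) / len(nearest)


def cv_fitness(n_neighbors, data, k=10, seed=0):
    """Fitness of one hyper-parameter value: mean absolute error averaged over k folds.

    A genetic algorithm (not shown) would minimise this value over candidate settings.
    """
    fold_errors = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [row for i, row in enumerate(data) if i not in held]
        errors = [
            abs(knn_predict(train, data[i][0], n_neighbors) - data[i][1])
            for i in held_out
        ]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data (x, y) with y = 2x; purely illustrative.
DATA = [(float(i), 2.0 * i) for i in range(50)]
fitness_small = cv_fitness(2, DATA)   # few neighbours: local averaging
fitness_large = cv_fitness(25, DATA)  # many neighbours: over-smoothed predictions
```

Averaging over all folds means every sample is held out exactly once, so a hyper-parameter setting cannot score well merely by fitting one favourable split.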