Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. Artificial neural network (ANN) algorithms have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to representative studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, namely Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results show that all three algorithms classified five basic types of land cover well (Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas), as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, with an overall accuracy of 86% and a Kappa coefficient of 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
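The two accuracy measures named in the abstract above, overall accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the study's Google Earth Engine workflow; the function name `accuracy_and_kappa` and the sample counts are hypothetical and chosen only for illustration.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals, summed per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class counts, not data from the study.
example = [[45, 5],
           [10, 40]]
overall_accuracy, kappa = accuracy_and_kappa(example)
print(overall_accuracy, kappa)
```

Because kappa discounts the agreement expected by chance, it cannot exceed the overall accuracy computed from the same matrix.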
The reasonable quantification of the concrete freezing environment on the Qinghai-Tibet Plateau (QTP) is the primary issue in frost-resistant concrete design, which is one of the challenges that QTP engineering managers should take into account. In this paper, we propose a more realistic method to calculate the number of concrete freeze-thaw cycles (NFTCs) on the QTP. The calculated results show that the NFTCs increase as the altitude of the meteorological station increases, with the average NFTCs being 208.7. Four machine learning methods, i.e., the random forest (RF) model, generalized boosting method (GBM), generalized linear model (GLM), and generalized additive model (GAM), are used to fit the NFTCs. The root mean square error (RMSE) values of the RF, GBM, GLM, and GAM are 32.3, 4.3, 247.9, and 161.3, respectively. The R² values of the RF, GBM, GLM, and GAM are 0.93, 0.99, 0.48, and 0.66, respectively. The GBM method performs best among the four methods, as shown by its RMSE and R² values. The quantitative results from the GBM method indicate that the lowest, medium, and highest NFTC values are distributed in the northern, central, and southern parts of the QTP, respectively. The annual NFTCs in the QTP region are mainly concentrated at 160 and above, and the average NFTCs is 200 across the QTP. Our results can provide scientific guidance and a theoretical basis for the freezing resistance design of concrete in various projects on the QTP.
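The RMSE and R² values used above to rank the four models follow directly from the residuals between observed and fitted values. Below is a minimal sketch of both metrics in plain Python; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical freeze-thaw cycle counts, for illustration only.
observed_cycles = [100, 200, 300]
fitted_cycles = [110, 190, 310]
print(rmse(observed_cycles, fitted_cycles))
print(r_squared(observed_cycles, fitted_cycles))
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is the sense in which the abstract ranks GBM first.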
The current study aimed to evaluate the capabilities of seven advanced machine learning techniques (MLTs), namely Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and to compare their performance. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups, with 70% used for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling: altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under the curve (AUC-ROC) approach was applied to evaluate, validate, and compare the performance of the MLTs. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that RF (AUC = 95.1%) and LDA (AUC = 94.17%) produced the best performance in comparison to the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
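The AUC-ROC measure used above to compare the seven techniques can be read as the probability that a randomly chosen positive sample (here, a landslide location) receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison in plain Python. It is an illustration under that definition, not the R workflow used in the study, and the function name `roc_auc` and the sample scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: sequence of 0/1 class labels (1 = positive).
    scores: model scores, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for three landslide and three
# non-landslide cells, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a few hundred validation points; rank-based formulas give the same value more efficiently on larger sets.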
The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to rockbursts at underground coal mines is much greater. Factors such as the coal seam's tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but the entire coal seam-surrounding rock system also has to be evaluated when trying to predict rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e., neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting (XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTG, which describes the tendency of the seam-surrounding rock system to rockbursts, and the anomaly of the vertical stress component were applied for this purpose. In particular, the decision tree and neural network models proved effective in correctly distinguishing rockbursts from tremors after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields, including stock market investment. However, few studies have focused on forecasting daily stock market returns, especially when using powerful machine learning techniques, such as deep neural networks (DNNs), to perform the analyses. DNNs employ various deep learning algorithms based on the combination of network structure, activation function, and model parameters, with their performance depending on the format of the data representation. This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) based on 60 financial and economic features. DNNs and traditional artificial neural networks (ANNs) are then deployed over the entire preprocessed but untransformed dataset, along with two datasets transformed via principal component analysis (PCA), to predict the daily direction of future stock market index returns. While controlling for overfitting, a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of hidden layers increases gradually from 12 to 1000. Moreover, a set of hypothesis testing procedures is implemented on the classification, and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset, as well as several other hybrid machine learning algorithms. In addition, the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested, including in a comparison against two standard benchmarks.
Some countries have announced national benchmark rates, while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021. Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate (TLREF), this study examines the determinants of TLREF. In this context, three global determinants, five country-level macroeconomic determinants, and the COVID-19 pandemic are considered, using daily data between December 28, 2018, and December 31, 2020, and applying machine learning algorithms and Ordinary Least Squares. The empirical results show that (1) the most significant determinant is the amount of securities bought by Central Banks; (2) country-level macroeconomic factors have a higher impact, whereas global factors are less important and the pandemic does not have a significant effect; (3) Random Forest is the most accurate prediction model. Taking action by considering the study's findings can help support economic growth by achieving low-level benchmark rates.
This study aims to empirically analyze teaching-learning-based optimization (TLBO) and machine learning algorithms using k-means and fuzzy c-means (FCM) algorithms for their individual performance evaluation in terms of clustering and classification. In the first phase, the clustering (k-means and FCM) algorithms were employed independently and the clustering accuracy was evaluated using different computational measures. During the second phase, the non-clustered data obtained from the first phase were preprocessed with TLBO. TLBO was performed using k-means (TLBO-KM) and FCM (TLBO-FCM) (TLBO-KM/FCM) algorithms. The objective function was determined by considering both minimization and maximization criteria. Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization. Five benchmark datasets from the University of California, Irvine (UCI) Machine Learning Repository were considered for comparative study and experimentation: breast cancer Wisconsin (BCW), Pima Indians Diabetes, Heart-Statlog, Hepatitis, and Cleveland Heart Disease. The combined average accuracy obtained collectively is approximately 99.4% in the case of TLBO-KM and 98.6% in the case of TLBO-FCM. This approach is also capable of finding the dominating attributes. The findings indicate that TLBO-KM/FCM, considering different computational measures, perform well on the non-clustered data where k-means and FCM, if employed independently, fail to provide significant results. Evaluating different feature sets, TLBO-KM/FCM and SVM (GS) clearly outperformed all other classifiers in terms of sensitivity, specificity, and accuracy. TLBO-KM/FCM attained the highest average sensitivity (98.7%), highest average specificity (98.4%), and highest average accuracy (99.4%) for 10-fold cross validation with different test data.
Cryptocurrency price prediction has garnered significant attention due to the growing importance of digital assets in the financial landscape. This paper presents a comprehensive study on predicting future cryptocurrency prices using machine learning algorithms. Open-source historical data from various cryptocurrency exchanges is utilized. Interpolation techniques are employed to handle missing data, ensuring the completeness and reliability of the dataset. Four technical indicators are selected as features for prediction. The study explores the application of five machine learning algorithms to capture the complex patterns in the highly volatile cryptocurrency market. The findings demonstrate the strengths and limitations of the different approaches, highlighting the significance of feature engineering and algorithm selection in achieving accurate cryptocurrency price predictions. The research contributes valuable insights into the dynamic and rapidly evolving field of cryptocurrency price prediction, assisting investors and traders in making informed decisions amidst the challenges posed by the cryptocurrency market.
This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLAs), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least squares (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 previously existing gully erosion sites were mapped through widespread field investigations, and the observations were randomly divided into 70% (323) for algorithm calibration and 30% (139) for validation. Twelve controlling factors for gully erosion, namely soil texture, annual mean rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLAs were compared using a training dataset for gully erosion and statistical measures such as RMSE (root mean square error), MAE (mean absolute error), and R-squared. Based on the comparisons among the MLAs, the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion, whereas plan curvature had the least importance. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified as having a high (12.56%) or very high (4.07%) susceptibility. The outcome generated by the RF model is validated using the ROC (Receiver Operating Characteristics) curve approach, which returned an area under the curve (AUC) of 0.985, indicating the excellent forecasting ability of the model. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.
Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens. To unravel the relationship between periodontitis and systemic diseases, it is very important to correctly discriminate major periodontal pathogens. To realize convenient, efficient, and high-accuracy bacterial species classification, the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens: Porphyromonas gingivalis (Pg), Fusobacterium nucleatum (Fn), and Aggregatibacter actinomycetemcomitans (Aa). The results show that this novel method can successfully discriminate the three abovementioned periodontal pathogens. Moreover, the classification accuracies for the three categories of the original data were 94.7% at the sample level and 93.9% at the spectrum level using the extra trees machine learning algorithm. This study provides a fast, simple, and accurate method that is very beneficial for differentiating periodontal pathogens.
Climate change has intensified maize stalk lodging, severely impacting global maize production. While numerous traits influence stalk lodging resistance, their relative importance remains unclear, hindering breeding efforts. This study introduces an approach combining wind tunnel testing with machine learning algorithms to quantitatively evaluate stalk lodging resistance traits. Through extensive field experiments and literature review, we identified and measured 74 phenotypic traits encompassing plant morphology, biomass, and anatomical characteristics in maize plants. Correlation analysis revealed a median linear correlation coefficient of 0.497 among these traits, with 15.1% of correlations exceeding 0.8. Principal component analysis showed that the first five components explained 90% of the total variance, indicating significant trait interactions. Through feature engineering and gradient boosting regression, we developed a high-precision wind speed-ear displacement prediction model (R² = 0.93) and identified 29 key traits critical for stalk lodging resistance. Sensitivity analysis revealed plant height as the most influential factor (sensitivity coefficient: −3.87), followed by traits of the 7th internode, including epidermis layer thickness (0.62), pith area (−0.60), and lignin content (0.35). Our methodological framework not only provides quantitative insights into maize stalk lodging resistance mechanisms but also establishes a systematic approach for trait evaluation. The findings offer practical guidance for breeding programs focused on enhancing stalk lodging resistance and yield stability under climate change conditions, with potential applications in agronomic practice optimization and breeding strategy development.
The state of New York emitted 143 million metric tons of carbon from fossil fuels in 2020, prompting the ambitious goal set by the CLCPA to achieve carbon neutrality. The paper focused on analyzing and predicting carbon emissions using four different machine learning algorithms. It examined emissions from fossil fuel combustion from 1990 to 2020 and validated the four algorithms to choose the most effective one for predicting emissions from 2020 to 2050. The analysis covered various economic sectors, including transportation, residential, commercial, industrial, and electric power. By analyzing policies, the paper forecasted emissions for 2030 and 2050, leading to the identification of different pathways to reach carbon neutrality. The research concluded that, in order to achieve neutrality, radical measures must be taken by the state of New York. Additionally, the paper compared the most recent data for 2021 with the forecasts, showing that significant measures need to be implemented to achieve the goal of carbon neutrality. Although some studies assume a trend of decreasing emissions, the research revealed different results. The paper presents three pathways, two of which follow the ambitious plan to reach carbon neutrality. The emission amounts by 2050 for the different pathways were projected to be 31.1, 22.4, and 111.95 MMt CO₂e, showcasing the need for urgent action to combat climate change.
A Trombe wall-heating system is used to absorb solar energy to heat buildings. Different parameters affect the system performance for optimal heating. This study evaluated the performance of four machine learning algorithms (linear regression, k-nearest neighbors, random forest, and decision tree) for predicting the room temperature in a Trombe wall system. The accuracy of the algorithms was assessed using R² and root mean squared error (RMSE) values. The results demonstrated that the k-nearest neighbors and random forest algorithms exhibited superior performance, with R² and RMSE values of 1 and 0, respectively. In contrast, linear regression and decision tree showed weaker performance. These findings highlight the potential of advanced machine learning algorithms for accurate room temperature prediction in Trombe wall systems, enabling informed design decisions to enhance energy efficiency.
In today's digital age, the popularity and development of online education systems provide students with more flexible and convenient ways of learning. However, students' adaptation to the online education system is affected by a variety of factors, including gender, age, educational background, and field of specialisation. Through in-depth analyses and studies of these factors, the following conclusions can be drawn. Gender has little influence on students' adaptation to online education, and male and female students perform similarly overall, but the proportion of male students at high adaptation levels is significantly higher than that of females. The majority of students show medium adaptability, indicating that the overall effect of online education is average. Students in the age groups of 6-10, 16-20, and 26-30 years old have lower adaptability levels, and there are more low-adaptability groups among students in colleges and universities. Students majoring in IT are more adapted to the online education system, and students not majoring in IT have a relatively poorer adaptability level. Local students are more adaptable to online education than foreign students. In areas with unstable electricity, students' adaptability is usually lower. The decision tree algorithm predictions showed good overall model accuracy, with higher prediction accuracy for students with high, low, and medium levels of adaptability. The test set accuracy was 93.27%, and the precision and recall were both 93.33%, indicating excellent model predictions. In summary, deeply analysing the influence of various factors on students' degree of adaptation to online education and using the random forest algorithm to make predictions can provide an important reference for improving the effectiveness of online education systems and useful insights for personalised education.
Edge Machine Learning (EdgeML) and Tiny Machine Learning (TinyML) are fast-growing fields that bring machine learning to resource-constrained devices, allowing real-time data processing and decision-making at the network's edge. However, the complexity of model conversion techniques, diverse inference mechanisms, and varied learning strategies make designing and deploying these models challenging. Additionally, deploying TinyML models on resource-constrained hardware with specific software frameworks has broadened EdgeML's applications across various sectors. These factors underscore the necessity for a comprehensive literature review, as current reviews do not systematically encompass the most recent findings on these topics. Consequently, this review provides a comprehensive overview of state-of-the-art techniques in model conversion, inference mechanisms, and learning strategies within EdgeML, and in deploying these models on resource-constrained edge devices using TinyML. It identifies 90 research articles published between 2018 and 2025, categorizing them into two main areas: (1) model conversion, inference, and learning strategies in EdgeML and (2) deploying TinyML models on resource-constrained hardware using specific software frameworks. In the first category, the synthesis of selected research articles compares and critically reviews various model conversion techniques, inference mechanisms, and learning strategies. In the second category, the synthesis identifies and elaborates on major development boards, software frameworks, sensors, and algorithms used in various applications across six major sectors. As a result, this article provides valuable insights for researchers, practitioners, and developers. It assists them in choosing suitable model conversion techniques, inference mechanisms, learning strategies, hardware development boards, software frameworks, sensors, and algorithms tailored to their specific needs and applications across various sectors.
BACKGROUND Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS A total of 712 patients (53.8% male; mean age 54.5 years ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5–91.9 cm, OR = 1.895, 95%CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95%CI: 0.730-2.188), and anxiety (OR = 1.071, 95%CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. Model performance revealed training/validation sensitivities of 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities of 0.602/0.511, 0.510/0.562, and 0.977/0.526; and corresponding areas under the receiver operating characteristic curve (AUCs) of 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820), respectively. DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming other methods in clinical applicability. CONCLUSION The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
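The sensitivity and specificity figures reported above for the training and validation sets are ratios taken from a binary confusion matrix. The sketch below shows the calculation in plain Python; the function name `sensitivity_specificity` and the sample labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Sensitivity = TP / (TP + FN): share of true positives that are detected.
    Specificity = TN / (TN + FP): share of true negatives correctly rejected.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have equal length")

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes (1 = difficult insertion), for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(actual, predicted)
print(sens, spec)
```

Computing the pair separately on training and validation data, as the abstract does, makes a gap between the two visible, which is one common sign of overfitting.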
The distillation process is an important chemical process, and the application of a data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which brings challenges to accurate data-driven modelling of distillation processes. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering in order to cluster the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables for removing redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct the new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial process of propylene distillation.
BACKGROUND Esophageal squamous cell carcinoma is a major histological subtype of esophageal cancer. Many molecular genetic changes are associated with its occurrence. Raman spectroscopy has become a new method for the early diagnosis of tumors because it can reflect the structures of substances and their changes at the molecular level. AIM To detect alterations in Raman spectral information across different stages of esophageal neoplasia. METHODS Esophageal lesions of different grades were sampled, and a total of 360 groups of Raman spectrum data were collected. A 1D-transformer network model was proposed to classify the spectral data of esophageal squamous cell carcinoma. In addition, a deep learning model was applied to visualize the Raman spectral data and interpret their molecular characteristics. RESULTS A comparison among Raman spectral data with different pathological grades and a visual analysis revealed that the Raman peaks with significant differences were concentrated mainly at 1095 cm⁻¹ (DNA, symmetric PO, and stretching vibration), 1132 cm⁻¹ (cytochrome c), 1171 cm⁻¹ (acetoacetate), 1216 cm⁻¹ (amide III), and 1315 cm⁻¹ (glycerol). A comparison among the training results of different models revealed that the 1D-transformer network performed best, achieving an accuracy of 93.30%, a specificity of 96.65%, a sensitivity of 93.30%, and an F1 score of 93.17%. CONCLUSION Raman spectroscopy revealed significantly different waveforms for the different stages of esophageal neoplasia. The combination of Raman spectroscopy and deep learning methods could significantly improve the accuracy of classification.
Inverse reinforcement learning optimal control operates within a learner-expert framework. The learner system can imitate the expert system's demonstrated behaviors and does not require a predefined cost function, so it can handle optimal control problems effectively. This paper proposes an inverse reinforcement learning optimal control method for Takagi-Sugeno (T-S) fuzzy systems. Based on learner systems, an expert system is constructed, where the learner system only knows the expert system's optimal control policy. To reconstruct the unknown cost function, we first develop a model-based inverse reinforcement learning algorithm for the case in which the system dynamics are known. The developed model-based learning algorithm consists of two learning stages: an inner reinforcement learning loop and an outer inverse optimal control loop. The inner loop aims to obtain the optimal control policy from the learner's cost function, and the outer loop aims to update the learner's state-penalty matrices using only the expert's optimal control policy. Then, to eliminate the requirement that the system dynamics must be known, a data-driven integral learning algorithm is presented. It is proved that the two presented algorithms are convergent and that the developed inverse reinforcement learning optimal control scheme ensures that the controlled fuzzy learner systems are asymptotically stable. Finally, we apply the proposed fuzzy optimal control to a truck-trailer system, and the computer simulation results verify the effectiveness of the presented approach.
Funding: financially supported by the National Natural Science Foundation of China (No. 52073031); the National Key Research and Development Program of China (Nos. 2023YFB3208102, 2021YFB3200304); the China National Postdoctoral Program for Innovative Talents (No. BX2021302); the Beijing Nova Program (Nos. Z191100001119047, Z211100002121148); the Fundamental Research Funds for the Central Universities (No. E0EG6801X2); the 'Hundred Talents Program' of the Chinese Academy of Sciences; and the BrainLink program funded by the MSIT through the NRF of Korea (No. RS-2023-00237308).
Abstract: Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. Artificial neural network (ANN) algorithms have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the research progress of neuromorphic devices is presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to representative studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Abstract: Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results show that all three algorithms classified five basic types of land cover (Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas) well, as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, with an overall accuracy of 86% and a Kappa coefficient of 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
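The two agreement measures reported in this abstract, overall accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch, assuming Cohen's kappa is the coefficient meant; the function name and the three-class matrix are illustrative and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 3-class confusion matrix (made-up counts, for illustration only).
cm = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"overall accuracy = {oa:.2f}, kappa = {kappa:.2f}")
```

Because chance agreement is never negative, kappa computed this way cannot exceed overall accuracy, which is a useful sanity check when reading reported pairs of values.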
Funding: supported by the Shandong Provincial Natural Science Foundation (grant number: ZR2023MD036); the Key Research and Development Project in Shandong Province (grant number: 2019GGX101064); and the project for excellent youth foundation of the innovation teacher team, Shandong (grant number: 2022KJ310).
Abstract: The reasonable quantification of the concrete freezing environment on the Qinghai-Tibet Plateau (QTP) is the primary issue in frost-resistant concrete design and one of the challenges that QTP engineering managers should take into account. In this paper, we propose a more realistic method to calculate the number of concrete freeze-thaw cycles (NFTCs) on the QTP. The calculated results show that the NFTCs increase as the altitude of the meteorological station increases, with an average of 208.7. Four machine learning methods, i.e., the random forest (RF) model, generalized boosting method (GBM), generalized linear model (GLM), and generalized additive model (GAM), are used to fit the NFTCs. The root mean square error (RMSE) values of the RF, GBM, GLM, and GAM are 32.3, 4.3, 247.9, and 161.3, respectively. The R² values of the RF, GBM, GLM, and GAM are 0.93, 0.99, 0.48, and 0.66, respectively. The GBM performs best among the four methods, as shown by its RMSE and R² values. The quantitative results from the GBM indicate that the lowest, medium, and highest NFTC values are distributed in the northern, central, and southern parts of the QTP, respectively. The annual NFTCs in the QTP region are mainly concentrated at 160 and above, and the average is 200 across the QTP. Our results can provide scientific guidance and a theoretical basis for the freezing resistance design of concrete in various projects on the QTP.
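The four models in this abstract are ranked by RMSE and R². As a minimal sketch of how those two scores are computed from observed and fitted values, assuming the usual definitions (R² as one minus the ratio of residual to total sum of squares), with made-up cycle counts rather than data from the study:

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and fitted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical freeze-thaw cycle counts at five stations (illustrative only).
observed = [150, 180, 210, 240, 270]
fitted = [155, 175, 212, 238, 268]

model_rmse = rmse(observed, fitted)
model_r2 = r_squared(observed, fitted)
print(f"RMSE = {model_rmse:.2f}, R^2 = {model_r2:.3f}")
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is why the two scores usually rank models in the same order, as they do for the four methods compared here.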
Abstract: The current study aimed at evaluating the capabilities of seven advanced machine learning techniques (MLTs), including Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and at comparing their performance. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups, with 70% used for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling: altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under the ROC curve (AUC) approach was applied to evaluate, validate, and compare the performance of the MLTs. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that RF (AUC = 95.1%) and LDA (AUC = 94.17%) produced the best performance in comparison to the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
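Two steps described in this abstract lend themselves to a short illustration: the random 70%/30% split of the inventory and the AUC score used to compare models. The sketch below assumes a simple shuffled split and computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one; the helper names, the seed, and the scores are hypothetical, and the study itself used R rather than Python.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle items reproducibly and return (training, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# 243 inventory locations, identified here only by index.
train_ids, valid_ids = split_70_30(range(243))
print(len(train_ids), len(valid_ids))

# Hypothetical validation labels (1 = landslide) and susceptibility scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for a few hundred locations; rank-based implementations are the usual choice for larger datasets.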
Funding: supported by the Ministry of Science and Higher Education, Republic of Poland (Statutory Activity of the Central Mining Institute, Grant No. 11133010).
Abstract: The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to rockbursts at underground coal mines is much greater. Factors such as the tendency of the coal seam to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, and the entire coal seam-surrounding rock system also has to be evaluated when trying to predict rockbursts. Moreover, hard coal mines experience stroke or stress-stroke rockbursts, in which the fracture of a thick layer of sandstone plays an essential role. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, interest in machine learning algorithms for solving complex nonlinear problems has increased, including in the geosciences. This study attempts to use machine learning algorithms, i.e., neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting (XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTG, which describes the tendency of the seam-surrounding rock system to rockbursts, and the anomaly of the vertical stress component were applied for this purpose. In particular, the decision tree and neural network models proved effective in correctly distinguishing rockbursts from tremors after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
Abstract: Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields, including stock market investment. However, few studies have focused on forecasting daily stock market returns, especially when using powerful machine learning techniques, such as deep neural networks (DNNs), to perform the analyses. DNNs employ various deep learning algorithms based on the combination of network structure, activation function, and model parameters, with their performance depending on the format of the data representation. This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) based on 60 financial and economic features. DNNs and traditional artificial neural networks (ANNs) are then deployed over the entire preprocessed but untransformed dataset, along with two datasets transformed via principal component analysis (PCA), to predict the daily direction of future stock market index returns. While controlling for overfitting, a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of hidden layers increases gradually from 12 to 1000. Moreover, a set of hypothesis testing procedures is implemented on the classification, and the simulation results show that the DNNs using the two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset, as well as several other hybrid machine learning algorithms. In addition, the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested, including in a comparison against two standard benchmarks.
Abstract: Some countries have announced national benchmark rates, while others have been responding to the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021. Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate (TLREF), this study examines the determinants of TLREF. In this context, three global determinants, five country-level macroeconomic determinants, and the COVID-19 pandemic are considered, using daily data between December 28, 2018, and December 31, 2020, and applying machine learning algorithms and ordinary least squares. The empirical results show that (1) the most significant determinant is the amount of securities bought by central banks; (2) country-level macroeconomic factors have a higher impact whereas global factors are less important, and the pandemic does not have a significant effect; (3) Random Forest is the most accurate prediction model. Taking action by considering the study's findings can help support economic growth by achieving low-level benchmark rates.
Abstract: This study aims to empirically analyze teaching-learning-based optimization (TLBO) and machine learning algorithms using k-means and fuzzy c-means (FCM) algorithms for their individual performance evaluation in terms of clustering and classification. In the first phase, the clustering (k-means and FCM) algorithms were employed independently and the clustering accuracy was evaluated using different computational measures. During the second phase, the non-clustered data obtained from the first phase were preprocessed with TLBO. TLBO was performed using the k-means (TLBO-KM) and FCM (TLBO-FCM) algorithms, referred to jointly as TLBO-KM/FCM. The objective function was determined by considering both minimization and maximization criteria. Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization. Five benchmark datasets from the University of California, Irvine (UCI) Machine Learning Repository were considered for comparative study and experimentation: the Breast Cancer Wisconsin (BCW), Pima Indians Diabetes, Heart-Statlog, Hepatitis, and Cleveland Heart Disease datasets. The combined average accuracy is approximately 99.4% for TLBO-KM and 98.6% for TLBO-FCM. This approach is also capable of finding the dominating attributes. The findings indicate that TLBO-KM/FCM, considering different computational measures, performs well on the non-clustered data where k-means and FCM, if employed independently, fail to provide significant results. Evaluating different feature sets, TLBO-KM/FCM and SVM (GS) clearly outperformed all other classifiers in terms of sensitivity, specificity, and accuracy. TLBO-KM/FCM attained the highest average sensitivity (98.7%), highest average specificity (98.4%), and highest average accuracy (99.4%) for 10-fold cross validation with different test data.
Abstract: Cryptocurrency price prediction has garnered significant attention due to the growing importance of digital assets in the financial landscape. This paper presents a comprehensive study on predicting future cryptocurrency prices using machine learning algorithms. Open-source historical data from various cryptocurrency exchanges is utilized. Interpolation techniques are employed to handle missing data, ensuring the completeness and reliability of the dataset. Four technical indicators are selected as features for prediction. The study explores the application of five machine learning algorithms to capture the complex patterns in the highly volatile cryptocurrency market. The findings demonstrate the strengths and limitations of the different approaches, highlighting the significance of feature engineering and algorithm selection in achieving accurate cryptocurrency price predictions. The research contributes valuable insights into the dynamic and rapidly evolving field of cryptocurrency price prediction, assisting investors and traders in making informed decisions amidst the challenges posed by the cryptocurrency market.
Funding: supported by the College of Agriculture, Shiraz University (Grant No. 97GRC1M271143), with funding from the UK Biotechnology and Biological Sciences Research Council (BBSRC) through grant award BBS/E/C/000I0330 (Soil to Nutrition project 3, Sustainable intensification: optimisation at multiple scales).
Abstract: This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLA), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least square (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 previously existing gully erosion sites were mapped through widespread field investigations, and the observations were randomly divided into 70% (323) for algorithm calibration and 30% (139) for validation. Twelve controlling factors for gully erosion, namely soil texture, annual mean rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLA were compared using a training dataset for gully erosion and statistical measures such as RMSE (root mean square error), MAE (mean absolute error), and R-squared. Based on the comparisons among MLA, the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion, whereas plan curvature had the least importance. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified as having a high (12.56%) or very high (4.07%) susceptibility. The outcome generated by the RF model was validated using the ROC (Receiver Operating Characteristics) curve approach, which returned an area under the curve (AUC) of 0.985, indicating the excellent forecasting ability of the model. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.
Funding: funded by the Major Program of Social Science Foundation of Tianjin Municipal Education Commission (2019JWZD53).
Abstract: Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens. To unravel the relationship between periodontitis and systemic diseases, it is very important to correctly discriminate major periodontal pathogens. To realize convenient, efficient, and high-accuracy bacterial species classification, the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens: Porphyromonas gingivalis (Pg), Fusobacterium nucleatum (Fn), and Aggregatibacter actinomycetemcomitans (Aa). The results show that this novel method can successfully discriminate the three abovementioned periodontal pathogens. Moreover, with the extra trees machine learning algorithm, the classification accuracies for the three categories of the original data were 94.7% at the sample level and 93.9% at the spectrum level. This study provides a fast, simple, and accurate method for differentiating periodontal pathogens.
Funding: funded by the State Key Program of National Natural Science Foundation of China (32330075); Construction of Collaborative Innovation Center of Beijing Academy of Agricultural and Forestry Sciences (KJCX20240406); and Beijing Academy of Agriculture and Forestry Sciences Foundation for Post-doctoral Scientists (2020-ZZ-014).
Abstract: Climate change has intensified maize stalk lodging, severely impacting global maize production. While numerous traits influence stalk lodging resistance, their relative importance remains unclear, hindering breeding efforts. This study introduces an approach combining wind tunnel testing with machine learning algorithms to quantitatively evaluate stalk lodging resistance traits. Through extensive field experiments and literature review, we identified and measured 74 phenotypic traits encompassing plant morphology, biomass, and anatomical characteristics in maize plants. Correlation analysis revealed a median linear correlation coefficient of 0.497 among these traits, with 15.1% of correlations exceeding 0.8. Principal component analysis showed that the first five components explained 90% of the total variance, indicating significant trait interactions. Through feature engineering and gradient boosting regression, we developed a high-precision wind speed-ear displacement prediction model (R² = 0.93) and identified 29 key traits critical for stalk lodging resistance. Sensitivity analysis revealed plant height as the most influential factor (sensitivity coefficient: −3.87), followed by traits of the 7th internode including epidermis layer thickness (0.62), pith area (−0.60), and lignin content (0.35). Our methodological framework not only provides quantitative insights into maize stalk lodging resistance mechanisms but also establishes a systematic approach for trait evaluation. The findings offer practical guidance for breeding programs focused on enhancing stalk lodging resistance and yield stability under climate change conditions, with potential applications in agronomic practice optimization and breeding strategy development.
Funding: supported by the National Natural Science Foundation of China (51534005, 51278293, and 51178262) and the China Environmental Protection Foundation.
Abstract: The state of New York emitted 143 million metric tons of carbon from fossil fuels in 2020, prompting the ambitious goal set by the CLCPA to achieve carbon neutrality. The paper focused on analyzing and predicting carbon emissions using four different machine learning algorithms. It examined emissions from fossil fuel combustion from 1990 to 2020 and validated the four algorithms to choose the most effective one for predicting emissions from 2020 to 2050. The analysis covered various economic sectors, including transportation, residential, commercial, industrial, and electric power. By analyzing policies, the paper forecasted emissions for 2030 and 2050, leading to the identification of different pathways to reach carbon neutrality. The research concluded that, in order to achieve neutrality, radical measures must be taken by the state of New York. Additionally, the paper compared the most recent data for 2021 with the forecasts, showing that significant measures need to be implemented to achieve the goal of carbon neutrality. Although some studies assume a trend of decreasing emissions, this research found otherwise. The paper presents three pathways, two of which follow the ambitious plan to reach carbon neutrality. The emissions projected for 2050 under the three pathways are 31.1, 22.4, and 111.95 MMt CO₂e, showcasing the need for urgent action to combat climate change.
Abstract: A Trombe wall-heating system is used to absorb solar energy to heat buildings. Different parameters affect the system performance for optimal heating. This study evaluated the performance of four machine learning algorithms (linear regression, k-nearest neighbors, random forest, and decision tree) for predicting the room temperature in a Trombe wall system. The accuracy of the algorithms was assessed using R² and root mean squared error (RMSE) values. The results demonstrated that the k-nearest neighbors and random forest algorithms exhibited superior performance, with R² and RMSE values of 1 and 0, respectively. In contrast, linear regression and decision tree showed weaker performance. These findings highlight the potential of advanced machine learning algorithms for accurate room temperature prediction in Trombe wall systems, enabling informed design decisions to enhance energy efficiency.
Abstract: In today's digital age, the popularity and development of online education systems provide students with more flexible and convenient ways of learning. However, students' adaptation to the online education system is affected by a variety of factors, including gender, age, educational background, and field of specialisation. Through in-depth analyses of these factors, the following conclusions can be drawn. Gender has little influence on students' adaptation to online education: male and female students perform similarly overall, but the proportion of male students at high adaptation levels is significantly higher than that of females. The majority of students show medium adaptability, indicating that the overall effect of online education is average. Students in the age groups of 6-10, 16-20, and 26-30 years have lower adaptability levels, and there are more low-adaptability groups among students in colleges and universities. Students majoring in IT are better adapted to the online education system, while students not majoring in IT have relatively poorer adaptability. Local students are more adaptable to online education than foreign students. In areas with unstable electricity, students' adaptability is usually lower. The decision tree algorithm predictions showed good overall model accuracy, with higher prediction accuracy for students with high, low, and medium levels of adaptability. The test set accuracy was 93.27%, and the precision and recall were both 93.33%, indicating excellent model predictions. In summary, deeply analysing the influence of various factors on students' degree of adaptation to online education and using the random forest algorithm to make predictions can provide an important reference for improving the effectiveness of online education systems and useful insights for personalised education.
Abstract: Edge Machine Learning (EdgeML) and Tiny Machine Learning (TinyML) are fast-growing fields that bring machine learning to resource-constrained devices, allowing real-time data processing and decision-making at the network’s edge. However, the complexity of model conversion techniques, diverse inference mechanisms, and varied learning strategies make designing and deploying these models challenging. Additionally, deploying TinyML models on resource-constrained hardware with specific software frameworks has broadened EdgeML’s applications across various sectors. These factors underscore the necessity for a comprehensive literature review, as current reviews do not systematically encompass the most recent findings on these topics. Consequently, this review provides a comprehensive overview of state-of-the-art techniques in model conversion, inference mechanisms, and learning strategies within EdgeML, and in deploying these models on resource-constrained edge devices using TinyML. It identifies 90 research articles published between 2018 and 2025, categorizing them into two main areas: (1) model conversion, inference, and learning strategies in EdgeML and (2) deploying TinyML models on resource-constrained hardware using specific software frameworks. In the first category, the synthesis of selected research articles compares and critically reviews various model conversion techniques, inference mechanisms, and learning strategies. In the second category, the synthesis identifies and elaborates on major development boards, software frameworks, sensors, and algorithms used in various applications across six major sectors. As a result, this article provides valuable insights for researchers, practitioners, and developers. It assists them in choosing suitable model conversion techniques, inference mechanisms, learning strategies, hardware development boards, software frameworks, sensors, and algorithms tailored to their specific needs and applications across various sectors.
Funding: registered with the Chinese Clinical Trial Registry (No. ChiCTR2000040109) and approved by the Hospital Ethics Committee (No. 20210130017).
Abstract: BACKGROUND Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5-91.9 cm, OR = 1.895, 95% CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95% CI: 0.730-2.188), and anxiety (OR = 1.071, 95% CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. Model performance revealed training/validation sensitivities of 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities of 0.602/0.511, 0.510/0.562, and 0.977/0.526; and corresponding areas under the receiver operating characteristic curve (AUCs) of 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820), respectively. DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming the other methods in clinical applicability. CONCLUSION The RF-based model exhibited superior predictive accuracy for DCI compared with the multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
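The decision curve analysis mentioned above compares models by their net benefit across probability thresholds. The following is a minimal sketch of the standard net-benefit formula, TP/n - (FP/n) * pt/(1 - pt), and is not the study's own code; the function names and the toy labels and risks are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # A false positive is weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return tp / n - weight * fp / n


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: flag every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = difficult insertion), not taken from the study.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

model_nb = net_benefit(y_true, y_prob, 0.5)
all_nb = treat_all_net_benefit(y_true, 0.5)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); plotting these quantities over a range of thresholds gives the decision curve.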
Funding: supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: The distillation process is an important chemical process, and data-driven modelling has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to address these problems. First, data segment variance is introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Second, the maximal information coefficient (MIC) is employed to calculate the nonlinear correlation between variables and remove redundant features. Finally, extreme gradient boosting (XGBoost) is integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, yielding a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
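The idea of separating steady-state from perturbed intervals by segment variance can be illustrated with a small sketch. This is a simplification and not the paper's KMDI algorithm: it merely computes the variance of fixed-width segments and clusters those variances with plain two-cluster K-means. The function names, segment width, and toy signal are all hypothetical.

```python
from statistics import pvariance


def segment_variances(series, width):
    """Variance of each consecutive, non-overlapping segment of length `width`."""
    return [pvariance(series[i:i + width])
            for i in range(0, len(series) - width + 1, width)]


def two_means_1d(values, iterations=50):
    """Plain 1-D K-means with K=2; label 1 marks the higher-variance cluster."""
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to the nearer centroid.
        labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        lows = [v for v, lab in zip(values, labels) if lab == 0]
        highs = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: recompute centroids, keeping the old one if a cluster is empty.
        new_low = sum(lows) / len(lows) if lows else low
        new_high = sum(highs) / len(highs) if highs else high
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return labels


# Hypothetical signal: steady, perturbed, then steady again.
signal = [5.0, 5.1, 4.9, 5.0, 5.0, 7.0, 3.0, 6.5, 5.0, 4.9, 5.1, 5.0]
variances = segment_variances(signal, width=4)
labels = two_means_1d(variances)
steady_segments = [i for i, lab in enumerate(labels) if lab == 0]
```

Segments labelled 0 would be kept as steady-state data; a real process would need a width and a clustering scheme tuned to its dynamics.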
Funding: Supported by Beijing Hospitals Authority Youth Programme, No. QML20200505.
Abstract: BACKGROUND Esophageal squamous cell carcinoma is a major histological subtype of esophageal cancer. Many molecular genetic changes are associated with its occurrence. Raman spectroscopy has become a new method for the early diagnosis of tumors because it can reflect the structures of substances and their changes at the molecular level. AIM To detect alterations in Raman spectral information across different stages of esophageal neoplasia. METHODS Esophageal lesions of different grades were collected, and a total of 360 groups of Raman spectral data were acquired. A 1D-transformer network model was proposed to classify the spectral data of esophageal squamous cell carcinoma. In addition, a deep learning model was applied to visualize the Raman spectral data and interpret their molecular characteristics. RESULTS A comparison among Raman spectral data with different pathological grades and a visual analysis revealed that the Raman peaks with significant differences were concentrated mainly at 1095 cm⁻¹ (DNA, symmetric PO, and stretching vibration), 1132 cm⁻¹ (cytochrome c), 1171 cm⁻¹ (acetoacetate), 1216 cm⁻¹ (amide III), and 1315 cm⁻¹ (glycerol). A comparison among the training results of different models revealed that the 1D-transformer network performed best, achieving an accuracy of 93.30%, a specificity of 96.65%, a sensitivity of 93.30%, and an F1 score of 93.17%. CONCLUSION Raman spectroscopy revealed significantly different waveforms for the different stages of esophageal neoplasia. The combination of Raman spectroscopy and deep learning methods could significantly improve the accuracy of classification.
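The accuracy, specificity, sensitivity, and F1 figures above are the kind of metrics usually derived from a multiclass confusion matrix. The sketch below shows one common convention, one-vs-rest counts with macro averaging. The abstract does not state how its metrics were averaged, so this convention is an assumption, and the three-class matrix is hypothetical.

```python
def macro_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    recalls, specificities, f1s = [], [], []
    for c in range(k):
        # One-vs-rest counts for class c.
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(k)) - tp
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        specificities.append(specificity)
        f1s.append(f1)
    return {
        "accuracy": sum(confusion[c][c] for c in range(k)) / total,
        "sensitivity": sum(recalls) / k,
        "specificity": sum(specificities) / k,
        "f1": sum(f1s) / k,
    }


# Hypothetical balanced three-class confusion matrix (10 samples per class).
toy_confusion = [
    [9, 1, 0],
    [1, 8, 1],
    [0, 1, 9],
]
metrics = macro_metrics(toy_confusion)
```

With balanced classes, macro-averaged sensitivity equals accuracy, and for k classes macro specificity equals 1 - (1 - accuracy)/(k - 1). The reported 93.30% and 96.65% fit that pattern for three classes, although the abstract itself does not confirm the number of classes or the averaging scheme.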
Funding: The National Natural Science Foundation of China (62173172).
Abstract: Inverse reinforcement learning optimal control operates within a learner-expert framework. The learner system can imitate the expert system's demonstrated behaviors and does not require a predefined cost function, so it can handle optimal control problems effectively. This paper proposes an inverse reinforcement learning optimal control method for Takagi-Sugeno (T-S) fuzzy systems. Based on learner systems, an expert system is constructed, where the learner system only knows the expert system's optimal control policy. To reconstruct the unknown cost function, we first develop a model-based inverse reinforcement learning algorithm for the case in which the system dynamics are known. The developed model-based learning algorithm consists of two learning stages: an inner reinforcement learning loop and an outer inverse optimal control loop. The inner loop seeks the optimal control policy for the learner's cost function, and the outer loop updates the learner's state-penalty matrices using only the expert's optimal control policy. Then, to eliminate the requirement that the system dynamics be known, a data-driven integral learning algorithm is presented. It is proved that the two presented algorithms are convergent and that the developed inverse reinforcement learning optimal control scheme ensures that the controlled fuzzy learner systems are asymptotically stable. Finally, we apply the proposed fuzzy optimal control to a truck-trailer system, and the computer simulation results verify the effectiveness of the presented approach.
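The two-loop structure described above can be illustrated on a far simpler problem than the paper's T-S fuzzy setting. The sketch below assumes a scalar linear system dx/dt = a*x + b*u with cost q*x^2 + r*u^2 and a known control weight r. The inner loop is Kleinman-style policy iteration, and the outer-loop idea is reduced to a closed-form recovery of q from the expert's gain. All names and numbers are hypothetical, and this is not the authors' algorithm.

```python
def policy_iteration(a, b, q, r, k0, iterations=30):
    """Inner loop: policy iteration for dx/dt = a*x + b*u with u = -k*x."""
    k = k0
    p = None
    for _ in range(iterations):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("gain must stabilise the system (a - b*k < 0)")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: the gain that minimises the Hamiltonian for this p.
        k = b * p / r
    return k, p


def recover_state_penalty(a, b, r, k_expert):
    """Outer-loop idea in closed form: the q for which k_expert is optimal."""
    # Invert k = b*p/r, then read q off the scalar Riccati equation
    # 2*a*p - (b*p)**2 / r + q = 0.
    p = r * k_expert / b
    return (b * p) ** 2 / r - 2.0 * a * p


# Hypothetical expert: a = b = r = 1 and an undisclosed state penalty q = 3.
k_expert, p_expert = policy_iteration(a=1.0, b=1.0, q=3.0, r=1.0, k0=2.0)
# The learner observes only k_expert and reconstructs the state penalty.
q_learner = recover_state_penalty(a=1.0, b=1.0, r=1.0, k_expert=k_expert)
```

In the scalar case with r fixed, the recovered penalty is unique; in the matrix-valued case treated in the paper, the outer loop updates the state-penalty matrices iteratively instead of in closed form.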