CARE—Cloud Archive Repository Express has emerged from algorithmic machine learning, and acts like a “fastlane” to bridge between DATA and wiseCIO where DATA stands for digital archiving & trans-analyt...CARE—Cloud Archive Repository Express has emerged from algorithmic machine learning, and acts like a “fastlane” to bridge between DATA and wiseCIO where DATA stands for digital archiving & trans-analytics, and wiseCIO for web-based intelligent service. CARE incorporates DATA and wiseCIO into a triad for content management and delivery (CMD) to orchestrate Anything as a Service (XaaS) by using mathematical and computational solutions to cloud-based problems. This article presents algorithmic machine learning in CARE for “DNA-like” ingredients with trivial information eliminated through deep learning to support integral content management over DATA and informative delivery on wiseCIO. In particular with algorithmic machine learning, CARE creatively incorporates express tokens for information interchange (eTokin) to promote seamless intercommunications among the CMD triad that enables Anything as a Service and empowers ordinary users to be UNIQ professionals: such as ubiquitous manager on content management and delivery, novel designer on universal interface and user-centric experience, intelligent expert for business intelligence, and quinary liaison with XaaS without explicitly coding required. Furthermore, CMD triad harnesses rapid prototyping for user interface design and propels cohesive assembly from Anything orchestrated as a Service. More importantly, CARE collaboratively as a whole promotes instant publishing over DATA, efficient presentation to end-users via wiseCIO, and diligent intelligence for business, education, and entertainment (iBEE) through highly robotic process automation.展开更多
Accurate prediction of flood events is important for flood control and risk management.Machine learning techniques contributed greatly to advances in flood predictions,and existing studies mainly focused on predicting...Accurate prediction of flood events is important for flood control and risk management.Machine learning techniques contributed greatly to advances in flood predictions,and existing studies mainly focused on predicting flood resource variables using single or hybrid machine learning techniques.However,class-based flood predictions have rarely been investigated,which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies.This study proposed a prediction approach of flood regime metrics and event classes coupling machine learning algorithms with clustering-deduced membership degrees.Five algorithms were adopted for this exploration.Results showed that the class membership degrees accurately determined event classes with class hit rates up to 100%,compared with the four classes clustered from nine regime metrics.The nonlinear algorithms(Multiple Linear Regression,Random Forest,and least squares-Support Vector Machine)outperformed the linear techniques(Multiple Linear Regression and Stepwise Regression)in predicting flood regime metrics.The proposed approach well predicted flood event classes with average class hit rates of 66.0%-85.4%and 47.2%-76.0%in calibration and validation periods,respectively,particularly for the slow and late flood events.The predictive capability of the proposed prediction approach for flood regime metrics and classes was considerably stronger than that of hydrological modeling approach.展开更多
Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decisio...Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decision making.It features parallel interconnected neural networks,high fault tolerance,robustness,autonomous learning capability,and ultralow energy dissipation.The algorithms of artificial neural network(ANN)have also been widely used because of their facile self-organization and self-learning capabilities,which mimic those of the human brain.To some extent,ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations.This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms.First,the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are particularly discussed.Second,the fabrication and research progress of neuromorphic devices are presented regarding to materials and structures.Furthermore,the fabrication of neuromorphic devices,including stand-alone neuromorphic devices,neuromorphic device arrays,and integrated neuromorphic systems,is discussed and demonstrated with reference to some respective studies.The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated.Finally,perspectives,suggestions,and potential solutions to the current challenges of neuromorphic devices are provided.展开更多
Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status ...Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status of land covers in Hung Yen province of Vietnam using Landsat 8 OLI satellite images,a free data source with reasonable spatial and temporal resolution.The results of the study show that all three algorithms presented good classification for five basic types of land cover including Rice land,Water bodies,Perennial vegetation,Annual vegetation,Built-up areas as their overall accuracy and Kappa coefficient were greater than 80%and 0.8,respectively.Among the three algorithms,SVM achieved the highest accuracy as its overall accuracy was 86%and the Kappa coefficient was 0.88.Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha,accounting for more than 33.8%of the total natural area,followed by Rice land and Perennial vegetation which cover an area of over 30,767 ha(33%)and 15,637 ha(16.8%),respectively.Water bodies and Annual vegetation cover the smallest areas with 8,820(9.5%)ha and 6,302 ha(6.8%),respectively.The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.展开更多
BACKGROUND Difficulty of colonoscopy insertion(DCI)significantly affects colonoscopy effectiveness and serves as a key quality indicator.Predicting and evaluating DCI risk preoperatively is crucial for optimizing intr...BACKGROUND Difficulty of colonoscopy insertion(DCI)significantly affects colonoscopy effectiveness and serves as a key quality indicator.Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies.AIM To evaluate the predictive performance of machine learning(ML)algorithms for DCI by comparing three modeling approaches,identify factors influencing DCI,and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency.METHODS This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021.Demographic data,past medical history,medication use,and psychological status were collected.The endoscopist assessed DCI using the visual analogue scale.After univariate screening,predictive models were developed using multivariable logistic regression,least absolute shrinkage and selection operator(LASSO)regression,and random forest(RF)algorithms.Model performance was evaluated based on discrimination,calibration,and decision curve analysis(DCA),and results were visualized using nomograms.RESULTS A total of 712 patients(53.8%male;mean age 54.5 years±12.9 years)were included.Logistic regression analysis identified constipation[odds ratio(OR)=2.254,95%confidence interval(CI):1.289-3.931],abdominal circumference(AC)(77.5–91.9 cm,OR=1.895,95%CI:1.065-3.350;AC≥92 cm,OR=1.271,95%CI:0.730-2.188),and anxiety(OR=1.071,95%CI:1.044-1.100)as predictive factors for DCI,validated by LASSO and RF methods.Model performance revealed training/validation sensitivities of 0.826/0.925,0.924/0.868,and 1.000/0.981;specificities of 0.602/0.511,0.510/0.562,and 0.977/0.526;and corresponding area under the receiver operating characteristic curves(AUCs)of 0.780(0.737-0.823)/0.726(0.654-0.799),0.754(0.710-0.798)/0.723(0.656-0.791),and 1.000(1.000-1.000)/0.754(0.688-0.820),respectively.DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37.The RF model demonstrated superior diagnostic accuracy,reflected by perfect training sensitivity(1.000)and highest validation AUC(0.754),outperforming other methods in clinical applicability.CONCLUSION The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models.This approach supports individualized preoperative optimization,enhancing colonoscopy quality through targeted risk stratification.展开更多
Machine learning(ML)has strong potential for soil settlement prediction,but determining hyperparameters for ML models is often intricate and laborious.Therefore,we apply Bayesian optimization to determine the optimal ...Machine learning(ML)has strong potential for soil settlement prediction,but determining hyperparameters for ML models is often intricate and laborious.Therefore,we apply Bayesian optimization to determine the optimal hyperparameter combinations,enhancing the effectiveness of ML models for soil parameter inversion.The ML models are trained using numerical simulation data generated with the modified Cam-Clay(MCC)model in ABAQUS software,and their performance is evaluated using ground settlement monitoring data from an airport runway.Five optimized ML models—decision tree(DT),random forest(RF),support vector regression(SVR),deep neural network(DNN),and one-dimensional convolutional neural network(1D-CNN)—are compared in terms of their accuracy for soil parameter inversion and settlement prediction.The results indicate that Bayesian optimization efficiently utilizes prior knowledge to identify the optimal hyperparameters,significantly improving model performance.Among the evaluated models,the 1D-CNN achieves the highest accuracy in soil parameter inversion,generating settlement predictions that closely match real monitoring data.These findings demonstrate the effectiveness of the proposed approach for soil parameter inversion and settlement prediction,and reveal how Bayesian optimization can refine the model selection process.展开更多
Due to the combined influences such as ore-forming temperature,fluid and metal sources,sphalerite tends to incorporate diverse contents of trace elements during the formation of different types of Lead-zinc(Pb-Zn)depo...Due to the combined influences such as ore-forming temperature,fluid and metal sources,sphalerite tends to incorporate diverse contents of trace elements during the formation of different types of Lead-zinc(Pb-Zn)deposits.Therefore,trace elements in sphalerite have long been utilized to distinguish Pb-Zn deposit types.However,previous discriminant diagrams usually contain two or three dimensions,which are limited to revealing the complicated interrelations between trace elements of sphalerite and the types of Pb-Zn deposits.In this study,we aim to prove that the sphalerite trace elements can be used to classify the Pb-Zn deposit types and extract key factors from sphalerite trace elements that can dis-criminate Pb-Zn deposit types using machine learning algorithms.A dataset of nearly 3600 sphalerite spot analyses from 95 Pb-Zn deposits worldwide determined by LA-ICP-MS was compiled from peer-reviewed publications,containing 12 elements(Mn,Fe,Co,Cu,Ga,Ge,Ag,Cd,In,Sn,Sb,and Pb)from 5 types,including Sedimentary Exhalative(SEDEX),Mississippi Valley Type(MVT),Volcanic Massive Sulfide(VMS),skarn,and epithermal deposits.Random Forests(RF)is applied to the data processing and the results show that trace elements of sphalerite can successfully discriminate different types of Pb-Zn deposits except for VMS deposits,most of which are falsely distinguished as skarn and epithermal types.To further discriminate VMS deposits,future studies could focus on enlarging the capacity of VMS deposits in datasets and applying other geological factors along with sphalerite trace elements when con-structing the classification model.RF’s feature importance and permutation feature importance were adopted to evaluate the element significance for classification.Besides,a visualized tool,t-distributed stochastic neighbor embedding(t-SNE),was used to verify the results of both classification and evalua-tion.The results presented here show that Mn,Co,and Ge display significant impacts on classification of Pb-Zn deposits and In,Ga,Sn,Cd,and Fe also have relatively important effects compared to the rest ele-ments,confirming that Pb-Zn deposits discrimination is mainly controlled by multi-elements in spha-lerite.Our study hence shows that machine learning algorithm can provide new insights into conventional geochemical analyses,inspiring future research on constructing classification models of mineral deposits using mineral geochemistry data.展开更多
The current study aimed at evaluating the capabilities of seven advanced machine learning techniques(MLTs),including,Support Vector Machine(SVM),Random Forest(RF),Multivariate Adaptive Regression Spline(MARS),Artifici...The current study aimed at evaluating the capabilities of seven advanced machine learning techniques(MLTs),including,Support Vector Machine(SVM),Random Forest(RF),Multivariate Adaptive Regression Spline(MARS),Artificial Neural Network(ANN),Quadratic Discriminant Analysis(QDA),Linear Discriminant Analysis(LDA),and Naive Bayes(NB),for landslide susceptibility modeling and comparison of their performances.Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue.This study was carried out using GIS and R open source software at Abha Basin,Asir Region,Saudi Arabia.First,a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources.All the landslide areas were randomly separated into two groups with a ratio of 70%for training and 30%for validating purposes.Twelve landslide-variables were generated for landslide susceptibility modeling,which include altitude,lithology,distance to faults,normalized difference vegetation index(NDVI),landuse/landcover(LULC),distance to roads,slope angle,distance to streams,profile curvature,plan curvature,slope length(LS),and slope-aspect.The area under curve(AUC-ROC)approach has been applied to evaluate,validate,and compare the MLTs performance.The results indicated that AUC values for seven MLTs range from 89.0%for QDA to 95.1%for RF.Our findings showed that the RF(AUC=95.1%)and LDA(AUC=941.7%)have produced the best performances in comparison to other MLTs.The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.展开更多
The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to the rockburst at underground coal mines is much greater.Factors such as ...The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to the rockburst at underground coal mines is much greater.Factors such as the coal seam tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but also the entire coal seam-surrounding rock system has to be evaluated when trying to predict the rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, the interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e. neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting(XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTGthat describes the tendency of the seam-surrounding rock system to rockbursts and the anomaly of the vertical stress component were applied for this purpose. Especially, the decision tree and neural network models were proved to be effective in correctly distinguishing rockbursts from tremors, after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.展开更多
This investigation assessed the efficacy of 10 widely used machine learning algorithms(MLA)comprising the least absolute shrinkage and selection operator(LASSO),generalized linear model(GLM),stepwise generalized linea...This investigation assessed the efficacy of 10 widely used machine learning algorithms(MLA)comprising the least absolute shrinkage and selection operator(LASSO),generalized linear model(GLM),stepwise generalized linear model(SGLM),elastic net(ENET),partial least square(PLS),ridge regression,support vector machine(SVM),classification and regression trees(CART),bagged CART,and random forest(RF)for gully erosion susceptibility mapping(GESM)in Iran.The location of 462 previously existing gully erosion sites were mapped through widespread field investigations,of which 70%(323)and 30%(139)of observations were arbitrarily divided for algorithm calibration and validation.Twelve controlling factors for gully erosion,namely,soil texture,annual mean rainfall,digital elevation model(DEM),drainage density,slope,lithology,topographic wetness index(TWI),distance from rivers,aspect,distance from roads,plan curvature,and profile curvature were ranked in terms of their importance using each MLA.The MLA were compared using a training dataset for gully erosion and statistical measures such as RMSE(root mean square error),MAE(mean absolute error),and R-squared.Based on the comparisons among MLA,the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared,and was therefore selected as the best model.The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion whereas plan curvature had the least importance.According to the GESM generated using RF,most of the study area is predicted to have a low(53.72%)or moderate(29.65%)susceptibility to gully erosion,whereas only a small area is identified to have a high(12.56%)or very high(4.07%)susceptibility.The outcome generated by RF model is validated using the ROC(Receiver Operating Characteristics)curve approach,which returned an area under the curve(AUC)of 0.985,proving the excellent forecasting ability of the model.The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.展开更多
Due to the development of the novel materials,the past two decades have witnessed the rapid advances of soft electronics.The soft electronics have huge potential in the physical sign monitoring and health care.One of ...Due to the development of the novel materials,the past two decades have witnessed the rapid advances of soft electronics.The soft electronics have huge potential in the physical sign monitoring and health care.One of the important advantages of soft electronics is forming good interface with skin,which can increase the user scale and improve the signal quality.Therefore,it is easy to build the specific dataset,which is important to improve the performance of machine learning algorithm.At the same time,with the assistance of machine learning algorithm,the soft electronics have become more and more intelligent to realize real-time analysis and diagnosis.The soft electronics and machining learning algorithms complement each other very well.It is indubitable that the soft electronics will bring us to a healthier and more intelligent world in the near future.Therefore,in this review,we will give a careful introduction about the new soft material,physiological signal detected by soft devices,and the soft devices assisted by machine learning algorithm.Some soft materials will be discussed such as two-dimensional material,carbon nanotube,nanowire,nanomesh,and hydrogel.Then,soft sensors will be discussed according to the physiological signal types(pulse,respiration,human motion,intraocular pressure,phonation,etc.).After that,the soft electronics assisted by various algorithms will be reviewed,including some classical algorithms and powerful neural network algorithms.Especially,the soft device assisted by neural network will be introduced carefully.Finally,the outlook,challenge,and conclusion of soft system powered by machine learning algorithm will be discussed.展开更多
Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields,including stock market investment.However,few studies have focused on f...Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields,including stock market investment.However,few studies have focused on forecasting daily stock market returns,especially when using powerful machine learning techniques,such as deep neural networks(DNNs),to perform the analyses.DNNs employ various deep learning algorithms based on the combination of network structure,activation function,and model parameters,with their performance depending on the format of the data representation.This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF(ticker symbol:SPY)based on 60 financial and economic features.DNNs and traditional artificial neural networks(ANNs)are then deployed over the entire preprocessed but untransformed dataset,along with two datasets transformed via principal component analysis(PCA),to predict the daily direction of future stock market index returns.While controlling for overfitting,a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of the hidden layers increases gradually from 12 to 1000.Moreover,a set of hypothesis testing procedures are implemented on the classification,and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset,as well as several other hybrid machine learning algorithms.In addition,the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested,including in a comparison against two standard benchmarks.展开更多
Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens.To unravel the relationship between periodontitis and systemic diseases,it is very important to correctly discrimina...Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens.To unravel the relationship between periodontitis and systemic diseases,it is very important to correctly discriminate major periodontal pathogens.To realize convenient,effcient,and high-accuracy bacterial species classification,the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens Porphyromonas gingivalis(Pg),Fusobacterium nucleatum(Fn),and Aggregatibacter actinomycetemcomitans(Aa).The result shows that this novel method can successfully discriminate the three abovementioned periodontal pathogens.Moreover,the classification accuracies for the three categories of the original data were 94.7%at the sample level and 93.9%at the spectrum level by the machine learning algorithm extra trees.This study provides a fast,simple,and accurate method which is very beneficial to differentiate periodontal pathogens.展开更多
Model parameters estimation is a pivotal issue for runoff modeling in ungauged catchments.The nonlinear relationship between model parameters and catchment descriptors is a major obstacle for parameter regionalization...Model parameters estimation is a pivotal issue for runoff modeling in ungauged catchments.The nonlinear relationship between model parameters and catchment descriptors is a major obstacle for parameter regionalization,which is the most widely used approach.Runoff modeling was studied in 38 catchments located in the Yellow–Huai–Hai River Basin(YHHRB).The values of the Nash–Sutcliffe efficiency coefficient(NSE),coefficient of determination(R2),and percent bias(PBIAS)indicated the acceptable performance of the soil and water assessment tool(SWAT)model in the YHHRB.Nine descriptors belonging to the categories of climate,soil,vegetation,and topography were used to express the catchment characteristics related to the hydrological processes.The quantitative relationships between the parameters of the SWAT model and the catchment descriptors were analyzed by six regression-based models,including linear regression(LR)equations,support vector regression(SVR),random forest(RF),k-nearest neighbor(kNN),decision tree(DT),and radial basis function(RBF).Each of the 38 catchments was assumed to be an ungauged catchment in turn.Then,the parameters in each target catchment were estimated by the constructed regression models based on the remaining 37 donor catchments.Furthermore,the similaritybased regionalization scheme was used for comparison with the regression-based approach.The results indicated that the runoff with the highest accuracy was modeled by the SVR-based scheme in ungauged catchments.Compared with the traditional LR-based approach,the accuracy of the runoff modeling in ungauged catchments was improved by the machine learning algorithms because of the outstanding capability to deal with nonlinear relationships.The performances of different approaches were similar in humid regions,while the advantages of the machine learning techniques were more evident in arid regions.When the study area contained nested catchments,the best result was calculated with the similarity-based parameter regionalization scheme because of the high catchment density and short spatial distance.The new findings could improve flood forecasting and water resources planning in regions that lack observed data.展开更多
Some countries have announced national benchmark rates,while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021.Considering that Turkey announced...Some countries have announced national benchmark rates,while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021.Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate(TLREF),this study examines the determinants of TLREF.In this context,three global determinants,five country-level macroeconomic determinants,and the COVID-19 pandemic are considered by using daily data between December 28,2018,and December 31,2020,by performing machine learning algorithms and Ordinary Least Square.The empirical results show that(1)the most significant determinant is the amount of securities bought by Central Banks;(2)country-level macroeconomic factors have a higher impact whereas global factors are less important,and the pandemic does not have a significant effect;(3)Random Forest is the most accurate prediction model.Taking action by considering the study’s findings can help support economic growth by achieving low-level benchmark rates.展开更多
This study aims to empirically analyze teaching-learning-based optimization(TLBO)and machine learning algorithms using k-means and fuzzy c-means(FCM)algorithms for their individual performance evaluation in terms of c...This study aims to empirically analyze teaching-learning-based optimization(TLBO)and machine learning algorithms using k-means and fuzzy c-means(FCM)algorithms for their individual performance evaluation in terms of clustering and classification.In the first phase,the clustering(k-means and FCM)algorithms were employed independently and the clustering accuracy was evaluated using different computationalmeasures.During the second phase,the non-clustered data obtained from the first phase were preprocessed with TLBO.TLBO was performed using k-means(TLBO-KM)and FCM(TLBO-FCM)(TLBO-KM/FCM)algorithms.The objective function was determined by considering both minimization and maximization criteria.Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization.Five benchmark datasets were considered from theUniversity of California,Irvine(UCI)Machine Learning Repository for comparative study and experimentation.These are breast cancer Wisconsin(BCW),Pima Indians Diabetes,Heart-Statlog,Hepatitis,and Cleveland Heart Disease datasets.The combined average accuracy obtained collectively is approximately 99.4%in case of TLBO-KM and 98.6%in case of TLBOFCM.This approach is also capable of finding the dominating attributes.The findings indicate that TLBO-KM/FCM,considering different computational measures,perform well on the non-clustered data where k-means and FCM,if employed independently,fail to provide significant results.Evaluating different feature sets,the TLBO-KM/FCM and SVM(GS)clearly outperformed all other classifiers in terms of sensitivity,specificity and accuracy.TLBOKM/FCM attained the highest average sensitivity(98.7%),highest average specificity(98.4%)and highest average accuracy(99.4%)for 10-fold cross validation with different test data.展开更多
The finite element(FE)-based simulation of welding characteristics was carried out to explore the relationship among welding assembly properties for the parallel T-shaped thin-walled parts of an antenna structure.The ...The finite element(FE)-based simulation of welding characteristics was carried out to explore the relationship among welding assembly properties for the parallel T-shaped thin-walled parts of an antenna structure.The effects of welding direction,clamping,fixture release time,fixed constraints,and welding sequences on these properties were analyzed,and the mapping relationship among welding characteristics was thoroughly examined.Different machine learning algorithms,including the generalized regression neural network(GRNN),wavelet neural network(WNN),and fuzzy neural network(FNN),are used to predict the multiple welding properties of thin-walled parts to mirror their variation trend and verify the correctness of the mapping relationship.Compared with those from GRNN and WNN,the maximum mean relative errors for the predicted values of deformation,temperature,and residual stress with FNN were less than 4.8%,1.4%,and 4.4%,respectively.These results indicate that FNN generated the best predicted welding characteristics.Analysis under various welding conditions also shows a mapping relationship among welding deformation,temperature,and residual stress over a period of time.This finding further provides a paramount basis for the control of welding assembly errors of an antenna structure in the future.展开更多
Objective:To study the application of a machine learning algorithm for predicting gestational diabetes mellitus(GDM)in early pregnancy.Methods:This study identified indicators related to GDM through a literature revie...Objective:To study the application of a machine learning algorithm for predicting gestational diabetes mellitus(GDM)in early pregnancy.Methods:This study identified indicators related to GDM through a literature review and expert discussion.Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis,and the collected indicators were retrospectively analyzed.Based on Python,the indicators were classified and modeled using a random forest regression algorithm,and the performance of the prediction model was analyzed.Results:We obtained 4806 analyzable data from 1625 pregnant women.Among these,3265 samples with all 67 indicators were used to establish data set F1;4806 samples with 38 identical indicators were used to establish data set F2.Each of F1 and F2 was used for training the random forest algorithm.The overall predictive accuracy of the F1 model was 93.10%,area under the receiver operating characteristic curve(AUC)was 0.66,and the predictive accuracy of GDM-positive cases was 37.10%.The corresponding values for the F2 model were 88.70%,0.87,and 79.44%.The results thus showed that the F2 prediction model performed better than the F1 model.To explore the impact of sacrificial indicators on GDM prediction,the F3 data set was established using 3265 samples(F1)with 38 indicators(F2).After training,the overall predictive accuracy of the F3 model was 91.60%,AUC was 0.58,and the predictive accuracy of positive cases was 15.85%.Conclusions:In this study,a model for predicting GDM with several input variables(e.g.,physical examination,past history,personal history,family history,and laboratory indicators)was established using a random forest regression algorithm.The trained prediction model exhibited a good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy.In addition,there are cer tain requirements for the propor tions of negative and positive cases in sample data sets when the random forest algorithm is applied to the early prediction of GDM.展开更多
This article firstly explains the concepts of artificial intelligence and algorithm separately,then determines the research status of artificial intelligence and machine learning in the background of the increasing po...This article firstly explains the concepts of artificial intelligence and algorithm separately,then determines the research status of artificial intelligence and machine learning in the background of the increasing popularity of artificial intelligence,and finally briefly describes the machine learning algorithm in the field of artificial intelligence,as well as puts forward appropriate development prospects,in order to provide theoretical reference for industry insider.展开更多
This study examines the feasibility of using a machine learning approach for rapid damage assessment of rein-forced concrete(RC)buildings after the earthquake.Since the real-world damaged datasets are lacking,have lim...This study examines the feasibility of using a machine learning approach for rapid damage assessment of rein-forced concrete(RC)buildings after the earthquake.Since the real-world damaged datasets are lacking,have limited access,or are imbalanced,a simulation dataset is prepared by conducting a nonlinear time history analy-sis.Different machine learning(ML)models are trained considering the structural parameters and ground motion characteristics to predict the RC building damage into five categories:null,slight,moderate,heavy,and collapse.The random forest classifier(RFC)has achieved a higher prediction accuracy on testing and real-world damaged datasets.The structural parameters can be extracted using different means such as Google Earth,Open Street Map,unmanned aerial vehicles,etc.However,recording the ground motion at a closer distance requires the installation of a dense array of sensors which requires a higher cost.For places with no earthquake recording station/device,it is difficult to have ground motion characteristics.For that different ML-based regressor models are developed utilizing past-earthquake information to predict ground motion parameters such as peak ground acceleration and peak ground velocity.The random forest regressor(RFR)achieved better results than other regression models on testing and validation datasets.Furthermore,compared with the results of similar research works,a better result is obtained using RFC and RFR on validation datasets.In the end,these models are uti-lized to predict the damage categories of RC buildings at Saitama University and Okubo Danchi,Saitama,Japan after an earthquake.This damage information is crucial for government agencies or decision-makers to respond systematically in post-disaster situations.展开更多
文摘CARE—Cloud Archive Repository Express has emerged from algorithmic machine learning, and acts like a “fastlane” to bridge between DATA and wiseCIO where DATA stands for digital archiving & trans-analytics, and wiseCIO for web-based intelligent service. CARE incorporates DATA and wiseCIO into a triad for content management and delivery (CMD) to orchestrate Anything as a Service (XaaS) by using mathematical and computational solutions to cloud-based problems. This article presents algorithmic machine learning in CARE for “DNA-like” ingredients with trivial information eliminated through deep learning to support integral content management over DATA and informative delivery on wiseCIO. In particular with algorithmic machine learning, CARE creatively incorporates express tokens for information interchange (eTokin) to promote seamless intercommunications among the CMD triad that enables Anything as a Service and empowers ordinary users to be UNIQ professionals: such as ubiquitous manager on content management and delivery, novel designer on universal interface and user-centric experience, intelligent expert for business intelligence, and quinary liaison with XaaS without explicitly coding required. Furthermore, CMD triad harnesses rapid prototyping for user interface design and propels cohesive assembly from Anything orchestrated as a Service. More importantly, CARE collaboratively as a whole promotes instant publishing over DATA, efficient presentation to end-users via wiseCIO, and diligent intelligence for business, education, and entertainment (iBEE) through highly robotic process automation.
基金National Key Research and Development Program of China,No.2023YFC3006704National Natural Science Foundation of China,No.42171047CAS-CSIRO Partnership Joint Project of 2024,No.177GJHZ2023097MI。
文摘Accurate prediction of flood events is important for flood control and risk management.Machine learning techniques contributed greatly to advances in flood predictions,and existing studies mainly focused on predicting flood resource variables using single or hybrid machine learning techniques.However,class-based flood predictions have rarely been investigated,which can aid in quickly diagnosing comprehensive flood characteristics and proposing targeted management strategies.This study proposed a prediction approach of flood regime metrics and event classes coupling machine learning algorithms with clustering-deduced membership degrees.Five algorithms were adopted for this exploration.Results showed that the class membership degrees accurately determined event classes with class hit rates up to 100%,compared with the four classes clustered from nine regime metrics.The nonlinear algorithms(Multiple Linear Regression,Random Forest,and least squares-Support Vector Machine)outperformed the linear techniques(Multiple Linear Regression and Stepwise Regression)in predicting flood regime metrics.The proposed approach well predicted flood event classes with average class hit rates of 66.0%-85.4%and 47.2%-76.0%in calibration and validation periods,respectively,particularly for the slow and late flood events.The predictive capability of the proposed prediction approach for flood regime metrics and classes was considerably stronger than that of hydrological modeling approach.
基金financially supported by the National Natural Science Foundation of China(No.52073031)the National Key Research and Development Program of China(Nos.2023YFB3208102,2021YFB3200304)+4 种基金the China National Postdoctoral Program for Innovative Talents(No.BX2021302)the Beijing Nova Program(Nos.Z191100001119047,Z211100002121148)the Fundamental Research Funds for the Central Universities(No.E0EG6801X2)the‘Hundred Talents Program’of the Chinese Academy of Sciencesthe BrainLink program funded by the MSIT through the NRF of Korea(No.RS-2023-00237308).
文摘Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks,e.g.,pattern processing,image recognition,and decision making.It features parallel interconnected neural networks,high fault tolerance,robustness,autonomous learning capability,and ultralow energy dissipation.The algorithms of artificial neural network(ANN)have also been widely used because of their facile self-organization and self-learning capabilities,which mimic those of the human brain.To some extent,ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations.This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms.First,the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are particularly discussed.Second,the fabrication and research progress of neuromorphic devices are presented regarding to materials and structures.Furthermore,the fabrication of neuromorphic devices,including stand-alone neuromorphic devices,neuromorphic device arrays,and integrated neuromorphic systems,is discussed and demonstrated with reference to some respective studies.The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated.Finally,perspectives,suggestions,and potential solutions to the current challenges of neuromorphic devices are provided.
文摘Based on the Google Earth Engine cloud computing data platform,this study employed three algorithms including Support Vector Machine,Random Forest,and Classification and Regression Tree to classify the current status of land covers in Hung Yen province of Vietnam using Landsat 8 OLI satellite images,a free data source with reasonable spatial and temporal resolution.The results of the study show that all three algorithms presented good classification for five basic types of land cover including Rice land,Water bodies,Perennial vegetation,Annual vegetation,Built-up areas as their overall accuracy and Kappa coefficient were greater than 80%and 0.8,respectively.Among the three algorithms,SVM achieved the highest accuracy as its overall accuracy was 86%and the Kappa coefficient was 0.88.Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha,accounting for more than 33.8%of the total natural area,followed by Rice land and Perennial vegetation which cover an area of over 30,767 ha(33%)and 15,637 ha(16.8%),respectively.Water bodies and Annual vegetation cover the smallest areas with 8,820(9.5%)ha and 6,302 ha(6.8%),respectively.The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
基金the Chinese Clinical Trial Registry(No.ChiCTR2000040109)approved by the Hospital Ethics Committee(No.20210130017).
文摘BACKGROUND Difficulty of colonoscopy insertion(DCI)significantly affects colonoscopy effectiveness and serves as a key quality indicator.Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies.AIM To evaluate the predictive performance of machine learning(ML)algorithms for DCI by comparing three modeling approaches,identify factors influencing DCI,and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency.METHODS This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021.Demographic data,past medical history,medication use,and psychological status were collected.The endoscopist assessed DCI using the visual analogue scale.After univariate screening,predictive models were developed using multivariable logistic regression,least absolute shrinkage and selection operator(LASSO)regression,and random forest(RF)algorithms.Model performance was evaluated based on discrimination,calibration,and decision curve analysis(DCA),and results were visualized using nomograms.RESULTS A total of 712 patients(53.8%male;mean age 54.5 years±12.9 years)were included.Logistic regression analysis identified constipation[odds ratio(OR)=2.254,95%confidence interval(CI):1.289-3.931],abdominal circumference(AC)(77.5–91.9 cm,OR=1.895,95%CI:1.065-3.350;AC≥92 cm,OR=1.271,95%CI:0.730-2.188),and anxiety(OR=1.071,95%CI:1.044-1.100)as predictive factors for DCI,validated by LASSO and RF methods.Model performance revealed training/validation sensitivities of 0.826/0.925,0.924/0.868,and 1.000/0.981;specificities of 0.602/0.511,0.510/0.562,and 0.977/0.526;and corresponding area under the receiver operating characteristic curves(AUCs)of 0.780(0.737-0.823)/0.726(0.654-0.799),0.754(0.710-0.798)/0.723(0.656-0.791),and 1.000(1.000-1.000)/0.754(0.688-0.820),respectively.DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37.The RF model demonstrated superior diagnostic accuracy,reflected by perfect training sensitivity(1.000)and highest validation AUC(0.754),outperforming other methods in clinical applicability.CONCLUSION The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models.This approach supports individualized preoperative optimization,enhancing colonoscopy quality through targeted risk stratification.
基金supported by the National Natural Science Foundation of China(Nos.52378419 and 52478368).
文摘Machine learning(ML)has strong potential for soil settlement prediction,but determining hyperparameters for ML models is often intricate and laborious.Therefore,we apply Bayesian optimization to determine the optimal hyperparameter combinations,enhancing the effectiveness of ML models for soil parameter inversion.The ML models are trained using numerical simulation data generated with the modified Cam-Clay(MCC)model in ABAQUS software,and their performance is evaluated using ground settlement monitoring data from an airport runway.Five optimized ML models—decision tree(DT),random forest(RF),support vector regression(SVR),deep neural network(DNN),and one-dimensional convolutional neural network(1D-CNN)—are compared in terms of their accuracy for soil parameter inversion and settlement prediction.The results indicate that Bayesian optimization efficiently utilizes prior knowledge to identify the optimal hyperparameters,significantly improving model performance.Among the evaluated models,the 1D-CNN achieves the highest accuracy in soil parameter inversion,generating settlement predictions that closely match real monitoring data.These findings demonstrate the effectiveness of the proposed approach for soil parameter inversion and settlement prediction,and reveal how Bayesian optimization can refine the model selection process.
基金We would like to acknowledge the financial support of the Ministry of Science and Technology of China(Grant No.2021YFC2900300)the National Natural Science Foundation of China(Grant Nos.41772074 and 42172103).
文摘Due to the combined influences such as ore-forming temperature,fluid and metal sources,sphalerite tends to incorporate diverse contents of trace elements during the formation of different types of Lead-zinc(Pb-Zn)deposits.Therefore,trace elements in sphalerite have long been utilized to distinguish Pb-Zn deposit types.However,previous discriminant diagrams usually contain two or three dimensions,which are limited to revealing the complicated interrelations between trace elements of sphalerite and the types of Pb-Zn deposits.In this study,we aim to prove that the sphalerite trace elements can be used to classify the Pb-Zn deposit types and extract key factors from sphalerite trace elements that can dis-criminate Pb-Zn deposit types using machine learning algorithms.A dataset of nearly 3600 sphalerite spot analyses from 95 Pb-Zn deposits worldwide determined by LA-ICP-MS was compiled from peer-reviewed publications,containing 12 elements(Mn,Fe,Co,Cu,Ga,Ge,Ag,Cd,In,Sn,Sb,and Pb)from 5 types,including Sedimentary Exhalative(SEDEX),Mississippi Valley Type(MVT),Volcanic Massive Sulfide(VMS),skarn,and epithermal deposits.Random Forests(RF)is applied to the data processing and the results show that trace elements of sphalerite can successfully discriminate different types of Pb-Zn deposits except for VMS deposits,most of which are falsely distinguished as skarn and epithermal types.To further discriminate VMS deposits,future studies could focus on enlarging the capacity of VMS deposits in datasets and applying other geological factors along with sphalerite trace elements when con-structing the classification model.RF’s feature importance and permutation feature importance were adopted to evaluate the element significance for classification.Besides,a visualized tool,t-distributed stochastic neighbor embedding(t-SNE),was used to verify the results of both classification and evalua-tion.The results presented here show that Mn,Co,and Ge display significant impacts on classification of Pb-Zn deposits and In,Ga,Sn,Cd,and Fe also have relatively important effects compared to the rest ele-ments,confirming that Pb-Zn deposits discrimination is mainly controlled by multi-elements in spha-lerite.Our study hence shows that machine learning algorithm can provide new insights into conventional geochemical analyses,inspiring future research on constructing classification models of mineral deposits using mineral geochemistry data.
文摘The current study aimed at evaluating the capabilities of seven advanced machine learning techniques(MLTs),including,Support Vector Machine(SVM),Random Forest(RF),Multivariate Adaptive Regression Spline(MARS),Artificial Neural Network(ANN),Quadratic Discriminant Analysis(QDA),Linear Discriminant Analysis(LDA),and Naive Bayes(NB),for landslide susceptibility modeling and comparison of their performances.Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue.This study was carried out using GIS and R open source software at Abha Basin,Asir Region,Saudi Arabia.First,a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources.All the landslide areas were randomly separated into two groups with a ratio of 70%for training and 30%for validating purposes.Twelve landslide-variables were generated for landslide susceptibility modeling,which include altitude,lithology,distance to faults,normalized difference vegetation index(NDVI),landuse/landcover(LULC),distance to roads,slope angle,distance to streams,profile curvature,plan curvature,slope length(LS),and slope-aspect.The area under curve(AUC-ROC)approach has been applied to evaluate,validate,and compare the MLTs performance.The results indicated that AUC values for seven MLTs range from 89.0%for QDA to 95.1%for RF.Our findings showed that the RF(AUC=95.1%)and LDA(AUC=941.7%)have produced the best performances in comparison to other MLTs.The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
基金supported by the Ministry of Science and Higher Education, Republic of Poland (Statutory Activity of the Central Mining Institute, Grant No. 11133010)
文摘The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to the rockburst at underground coal mines is much greater.Factors such as the coal seam tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but also the entire coal seam-surrounding rock system has to be evaluated when trying to predict the rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, the interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e. neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting(XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTGthat describes the tendency of the seam-surrounding rock system to rockbursts and the anomaly of the vertical stress component were applied for this purpose. Especially, the decision tree and neural network models were proved to be effective in correctly distinguishing rockbursts from tremors, after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
基金supported by the College of Agriculture,Shiraz University(Grant No.97GRC1M271143)funding from the UK Biotechnology and Biological Sciences Research Council(BBSRC)funded by BBSRC grant award BBS/E/C/000I0330–Soil to Nutrition project 3–Sustainable intensification:optimisation at multiple scales。
文摘This investigation assessed the efficacy of 10 widely used machine learning algorithms(MLA)comprising the least absolute shrinkage and selection operator(LASSO),generalized linear model(GLM),stepwise generalized linear model(SGLM),elastic net(ENET),partial least square(PLS),ridge regression,support vector machine(SVM),classification and regression trees(CART),bagged CART,and random forest(RF)for gully erosion susceptibility mapping(GESM)in Iran.The location of 462 previously existing gully erosion sites were mapped through widespread field investigations,of which 70%(323)and 30%(139)of observations were arbitrarily divided for algorithm calibration and validation.Twelve controlling factors for gully erosion,namely,soil texture,annual mean rainfall,digital elevation model(DEM),drainage density,slope,lithology,topographic wetness index(TWI),distance from rivers,aspect,distance from roads,plan curvature,and profile curvature were ranked in terms of their importance using each MLA.The MLA were compared using a training dataset for gully erosion and statistical measures such as RMSE(root mean square error),MAE(mean absolute error),and R-squared.Based on the comparisons among MLA,the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared,and was therefore selected as the best model.The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion whereas plan curvature had the least importance.According to the GESM generated using RF,most of the study area is predicted to have a low(53.72%)or moderate(29.65%)susceptibility to gully erosion,whereas only a small area is identified to have a high(12.56%)or very high(4.07%)susceptibility.The outcome generated by RF model is validated using the ROC(Receiver Operating Characteristics)curve approach,which returned an area under the curve(AUC)of 0.985,proving the excellent forecasting ability of the model.The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.
基金supported by National Natural Science Foundation of China(No.62201624,32000939,21775168,22174167,51861145202,U20A20168)the Guangdong Basic and Applied Basic Research Foundation(2019A1515111183)+3 种基金Shenzhen Research Funding Program(JCYJ20190807160401657,JCYJ201908073000608,JCYJ20150831192224146)the National Key R&D Program(2018YFC2001202)the support of the Research Fund from Tsinghua University Initiative Scientific Research Programthe support from Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province(No.2020B1212060077)。
文摘Due to the development of the novel materials,the past two decades have witnessed the rapid advances of soft electronics.The soft electronics have huge potential in the physical sign monitoring and health care.One of the important advantages of soft electronics is forming good interface with skin,which can increase the user scale and improve the signal quality.Therefore,it is easy to build the specific dataset,which is important to improve the performance of machine learning algorithm.At the same time,with the assistance of machine learning algorithm,the soft electronics have become more and more intelligent to realize real-time analysis and diagnosis.The soft electronics and machining learning algorithms complement each other very well.It is indubitable that the soft electronics will bring us to a healthier and more intelligent world in the near future.Therefore,in this review,we will give a careful introduction about the new soft material,physiological signal detected by soft devices,and the soft devices assisted by machine learning algorithm.Some soft materials will be discussed such as two-dimensional material,carbon nanotube,nanowire,nanomesh,and hydrogel.Then,soft sensors will be discussed according to the physiological signal types(pulse,respiration,human motion,intraocular pressure,phonation,etc.).After that,the soft electronics assisted by various algorithms will be reviewed,including some classical algorithms and powerful neural network algorithms.Especially,the soft device assisted by neural network will be introduced carefully.Finally,the outlook,challenge,and conclusion of soft system powered by machine learning algorithm will be discussed.
文摘Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields,including stock market investment.However,few studies have focused on forecasting daily stock market returns,especially when using powerful machine learning techniques,such as deep neural networks(DNNs),to perform the analyses.DNNs employ various deep learning algorithms based on the combination of network structure,activation function,and model parameters,with their performance depending on the format of the data representation.This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF(ticker symbol:SPY)based on 60 financial and economic features.DNNs and traditional artificial neural networks(ANNs)are then deployed over the entire preprocessed but untransformed dataset,along with two datasets transformed via principal component analysis(PCA),to predict the daily direction of future stock market index returns.While controlling for overfitting,a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of the hidden layers increases gradually from 12 to 1000.Moreover,a set of hypothesis testing procedures are implemented on the classification,and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset,as well as several other hybrid machine learning algorithms.In addition,the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested,including in a comparison against two standard benchmarks.
基金funded by the Major Program of Social Science Foundation of Tianjin Municipal Education Commission(2019JWZD53).
文摘Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens.To unravel the relationship between periodontitis and systemic diseases,it is very important to correctly discriminate major periodontal pathogens.To realize convenient,effcient,and high-accuracy bacterial species classification,the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens Porphyromonas gingivalis(Pg),Fusobacterium nucleatum(Fn),and Aggregatibacter actinomycetemcomitans(Aa).The result shows that this novel method can successfully discriminate the three abovementioned periodontal pathogens.Moreover,the classification accuracies for the three categories of the original data were 94.7%at the sample level and 93.9%at the spectrum level by the machine learning algorithm extra trees.This study provides a fast,simple,and accurate method which is very beneficial to differentiate periodontal pathogens.
基金funded by the National Key Research and Development Program of China(2017YFA0605002,2017YFA0605004,and 2016YFA0601501)the National Natural Science Foundation of China(41961124007,51779145,and 41830863)“Six top talents”in Jiangsu Province(RJFW-031)。
文摘Model parameters estimation is a pivotal issue for runoff modeling in ungauged catchments.The nonlinear relationship between model parameters and catchment descriptors is a major obstacle for parameter regionalization,which is the most widely used approach.Runoff modeling was studied in 38 catchments located in the Yellow–Huai–Hai River Basin(YHHRB).The values of the Nash–Sutcliffe efficiency coefficient(NSE),coefficient of determination(R2),and percent bias(PBIAS)indicated the acceptable performance of the soil and water assessment tool(SWAT)model in the YHHRB.Nine descriptors belonging to the categories of climate,soil,vegetation,and topography were used to express the catchment characteristics related to the hydrological processes.The quantitative relationships between the parameters of the SWAT model and the catchment descriptors were analyzed by six regression-based models,including linear regression(LR)equations,support vector regression(SVR),random forest(RF),k-nearest neighbor(kNN),decision tree(DT),and radial basis function(RBF).Each of the 38 catchments was assumed to be an ungauged catchment in turn.Then,the parameters in each target catchment were estimated by the constructed regression models based on the remaining 37 donor catchments.Furthermore,the similaritybased regionalization scheme was used for comparison with the regression-based approach.The results indicated that the runoff with the highest accuracy was modeled by the SVR-based scheme in ungauged catchments.Compared with the traditional LR-based approach,the accuracy of the runoff modeling in ungauged catchments was improved by the machine learning algorithms because of the outstanding capability to deal with nonlinear relationships.The performances of different approaches were similar in humid regions,while the advantages of the machine learning techniques were more evident in arid regions.When the study area contained nested catchments,the best result was calculated with the similarity-based parameter regionalization scheme because of the high catchment density and short spatial distance.The new findings could improve flood forecasting and water resources planning in regions that lack observed data.
文摘Some countries have announced national benchmark rates,while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021.Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate(TLREF),this study examines the determinants of TLREF.In this context,three global determinants,five country-level macroeconomic determinants,and the COVID-19 pandemic are considered by using daily data between December 28,2018,and December 31,2020,by performing machine learning algorithms and Ordinary Least Square.The empirical results show that(1)the most significant determinant is the amount of securities bought by Central Banks;(2)country-level macroeconomic factors have a higher impact whereas global factors are less important,and the pandemic does not have a significant effect;(3)Random Forest is the most accurate prediction model.Taking action by considering the study’s findings can help support economic growth by achieving low-level benchmark rates.
文摘This study aims to empirically analyze teaching-learning-based optimization(TLBO)and machine learning algorithms using k-means and fuzzy c-means(FCM)algorithms for their individual performance evaluation in terms of clustering and classification.In the first phase,the clustering(k-means and FCM)algorithms were employed independently and the clustering accuracy was evaluated using different computationalmeasures.During the second phase,the non-clustered data obtained from the first phase were preprocessed with TLBO.TLBO was performed using k-means(TLBO-KM)and FCM(TLBO-FCM)(TLBO-KM/FCM)algorithms.The objective function was determined by considering both minimization and maximization criteria.Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization.Five benchmark datasets were considered from theUniversity of California,Irvine(UCI)Machine Learning Repository for comparative study and experimentation.These are breast cancer Wisconsin(BCW),Pima Indians Diabetes,Heart-Statlog,Hepatitis,and Cleveland Heart Disease datasets.The combined average accuracy obtained collectively is approximately 99.4%in case of TLBO-KM and 98.6%in case of TLBOFCM.This approach is also capable of finding the dominating attributes.The findings indicate that TLBO-KM/FCM,considering different computational measures,perform well on the non-clustered data where k-means and FCM,if employed independently,fail to provide significant results.Evaluating different feature sets,the TLBO-KM/FCM and SVM(GS)clearly outperformed all other classifiers in terms of sensitivity,specificity and accuracy.TLBOKM/FCM attained the highest average sensitivity(98.7%),highest average specificity(98.4%)and highest average accuracy(99.4%)for 10-fold cross validation with different test data.
基金The Natural Science Foundation of Jiangsu Province,China(No.BK20200470)China Postdoctoral Science Foundation(No.2021M691595)Innovation and Entrepreneurship Plan Talent Program of Jiangsu Province(No.AD99002).
文摘The finite element(FE)-based simulation of welding characteristics was carried out to explore the relationship among welding assembly properties for the parallel T-shaped thin-walled parts of an antenna structure.The effects of welding direction,clamping,fixture release time,fixed constraints,and welding sequences on these properties were analyzed,and the mapping relationship among welding characteristics was thoroughly examined.Different machine learning algorithms,including the generalized regression neural network(GRNN),wavelet neural network(WNN),and fuzzy neural network(FNN),are used to predict the multiple welding properties of thin-walled parts to mirror their variation trend and verify the correctness of the mapping relationship.Compared with those from GRNN and WNN,the maximum mean relative errors for the predicted values of deformation,temperature,and residual stress with FNN were less than 4.8%,1.4%,and 4.4%,respectively.These results indicate that FNN generated the best predicted welding characteristics.Analysis under various welding conditions also shows a mapping relationship among welding deformation,temperature,and residual stress over a period of time.This finding further provides a paramount basis for the control of welding assembly errors of an antenna structure in the future.
基金supported by the Qingdao Municipal Bureau of Science and Technology(No.19-6-1-55-nsh)。
文摘Objective:To study the application of a machine learning algorithm for predicting gestational diabetes mellitus(GDM)in early pregnancy.Methods:This study identified indicators related to GDM through a literature review and expert discussion.Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis,and the collected indicators were retrospectively analyzed.Based on Python,the indicators were classified and modeled using a random forest regression algorithm,and the performance of the prediction model was analyzed.Results:We obtained 4806 analyzable data from 1625 pregnant women.Among these,3265 samples with all 67 indicators were used to establish data set F1;4806 samples with 38 identical indicators were used to establish data set F2.Each of F1 and F2 was used for training the random forest algorithm.The overall predictive accuracy of the F1 model was 93.10%,area under the receiver operating characteristic curve(AUC)was 0.66,and the predictive accuracy of GDM-positive cases was 37.10%.The corresponding values for the F2 model were 88.70%,0.87,and 79.44%.The results thus showed that the F2 prediction model performed better than the F1 model.To explore the impact of sacrificial indicators on GDM prediction,the F3 data set was established using 3265 samples(F1)with 38 indicators(F2).After training,the overall predictive accuracy of the F3 model was 91.60%,AUC was 0.58,and the predictive accuracy of positive cases was 15.85%.Conclusions:In this study,a model for predicting GDM with several input variables(e.g.,physical examination,past history,personal history,family history,and laboratory indicators)was established using a random forest regression algorithm.The trained prediction model exhibited a good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy.In addition,there are cer tain requirements for the propor tions of negative and positive cases in sample data sets when the random forest algorithm is applied to the early prediction of GDM.
文摘This article firstly explains the concepts of artificial intelligence and algorithm separately,then determines the research status of artificial intelligence and machine learning in the background of the increasing popularity of artificial intelligence,and finally briefly describes the machine learning algorithm in the field of artificial intelligence,as well as puts forward appropriate development prospects,in order to provide theoretical reference for industry insider.
文摘This study examines the feasibility of using a machine learning approach for rapid damage assessment of rein-forced concrete(RC)buildings after the earthquake.Since the real-world damaged datasets are lacking,have limited access,or are imbalanced,a simulation dataset is prepared by conducting a nonlinear time history analy-sis.Different machine learning(ML)models are trained considering the structural parameters and ground motion characteristics to predict the RC building damage into five categories:null,slight,moderate,heavy,and collapse.The random forest classifier(RFC)has achieved a higher prediction accuracy on testing and real-world damaged datasets.The structural parameters can be extracted using different means such as Google Earth,Open Street Map,unmanned aerial vehicles,etc.However,recording the ground motion at a closer distance requires the installation of a dense array of sensors which requires a higher cost.For places with no earthquake recording station/device,it is difficult to have ground motion characteristics.For that different ML-based regressor models are developed utilizing past-earthquake information to predict ground motion parameters such as peak ground acceleration and peak ground velocity.The random forest regressor(RFR)achieved better results than other regression models on testing and validation datasets.Furthermore,compared with the results of similar research works,a better result is obtained using RFC and RFR on validation datasets.In the end,these models are uti-lized to predict the damage categories of RC buildings at Saitama University and Okubo Danchi,Saitama,Japan after an earthquake.This damage information is crucial for government agencies or decision-makers to respond systematically in post-disaster situations.