CARE (Cloud Archive Repository Express) has emerged from algorithmic machine learning and acts as a “fastlane” bridging DATA and wiseCIO, where DATA stands for digital archiving & trans-analytics and wiseCIO for web-based intelligent service. CARE incorporates DATA and wiseCIO into a triad for content management and delivery (CMD) to orchestrate Anything as a Service (XaaS) by using mathematical and computational solutions to cloud-based problems. This article presents algorithmic machine learning in CARE for “DNA-like” ingredients, with trivial information eliminated through deep learning, to support integral content management over DATA and informative delivery on wiseCIO. In particular, with algorithmic machine learning, CARE incorporates express tokens for information interchange (eTokin) to promote seamless intercommunication among the CMD triad, which enables Anything as a Service and empowers ordinary users to be UNIQ professionals: ubiquitous manager of content management and delivery, novel designer of universal interface and user-centric experience, intelligent expert for business intelligence, and quinary liaison with XaaS, with no explicit coding required. Furthermore, the CMD triad harnesses rapid prototyping for user interface design and propels cohesive assembly from Anything orchestrated as a Service. More importantly, CARE as a whole promotes instant publishing over DATA, efficient presentation to end-users via wiseCIO, and diligent intelligence for business, education, and entertainment (iBEE) through highly robotic process automation.
Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. The algorithms of artificial neural networks (ANNs) have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANNs reflect several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to respective studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, namely Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results of the study show that all three algorithms presented good classification for five basic types of land cover (Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas), as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy, as its overall accuracy was 86% and the Kappa coefficient was 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover an area of over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
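The two agreement measures quoted in this abstract, overall accuracy and the Kappa coefficient, can both be read off a confusion matrix. The following is a minimal sketch in plain Python, assuming the standard Cohen's kappa definition; the function name and the 2x2 example matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples of reference class i that the
    classifier assigned to class j (illustrative convention).
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example, not data from the study.
example_cm = [[45, 5],
              [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example_cm)
```

Under this definition the chance-agreement term is non-negative, so kappa cannot exceed overall accuracy for the same confusion matrix.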
BACKGROUND: Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM: To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS: This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS: A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5–91.9 cm, OR = 1.895, 95% CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95% CI: 0.730-2.188), and anxiety (OR = 1.071, 95% CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. Model performance revealed training/validation sensitivities of 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities of 0.602/0.511, 0.510/0.562, and 0.977/0.526; and corresponding areas under the receiver operating characteristic curve (AUCs) of 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820), respectively. DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming other methods in clinical applicability. CONCLUSION: The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
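The discrimination measures reported above (sensitivity, specificity, and AUC) can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions, with AUC computed as the probability that a positive case scores higher than a negative one; the function names, labels, and scores are hypothetical and do not reproduce the study's models or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities, not study data.
labels = [1, 1, 0, 0]
probabilities = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]  # 0.5 cut-off is an assumption
sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, probabilities)
```

Computing these on the training and validation sets separately, as the abstract does, makes it easier to see when a model fits the training data far better than unseen data.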
This study examines the feasibility of using a machine learning approach for rapid damage assessment of reinforced concrete (RC) buildings after an earthquake. Since real-world damage datasets are lacking, have limited access, or are imbalanced, a simulation dataset is prepared by conducting a nonlinear time history analysis. Different machine learning (ML) models are trained considering the structural parameters and ground motion characteristics to classify RC building damage into five categories: null, slight, moderate, heavy, and collapse. The random forest classifier (RFC) achieved a higher prediction accuracy on testing and real-world damage datasets. The structural parameters can be extracted using different means such as Google Earth, Open Street Map, unmanned aerial vehicles, etc. However, recording the ground motion at a closer distance requires the installation of a dense array of sensors, which is costly. For places with no earthquake recording station or device, it is difficult to obtain ground motion characteristics. For that purpose, different ML-based regressor models are developed utilizing past-earthquake information to predict ground motion parameters such as peak ground acceleration and peak ground velocity. The random forest regressor (RFR) achieved better results than other regression models on testing and validation datasets. Furthermore, compared with the results of similar research works, a better result is obtained using RFC and RFR on validation datasets. In the end, these models are utilized to predict the damage categories of RC buildings at Saitama University and Okubo Danchi, Saitama, Japan, after an earthquake. This damage information is crucial for government agencies or decision-makers to respond systematically in post-disaster situations.
The reasonable quantification of the concrete freezing environment on the Qinghai-Tibet Plateau (QTP) is the primary issue in frost-resistant concrete design, which is one of the challenges that QTP engineering managers should take into account. In this paper, we propose a more realistic method to calculate the number of concrete freeze-thaw cycles (NFTCs) on the QTP. The calculated results show that the NFTCs increase as the altitude of the meteorological station increases, with the average NFTCs being 208.7. Four machine learning methods, i.e., the random forest (RF) model, generalized boosting method (GBM), generalized linear model (GLM), and generalized additive model (GAM), are used to fit the NFTCs. The root mean square error (RMSE) values of the RF, GBM, GLM, and GAM are 32.3, 4.3, 247.9, and 161.3, respectively. The R^2 values of the RF, GBM, GLM, and GAM are 0.93, 0.99, 0.48, and 0.66, respectively. The GBM method performs the best of the four methods, as shown by the RMSE and R^2 values. The quantitative results from the GBM method indicate that the lowest, medium, and highest NFTC values are distributed in the northern, central, and southern parts of the QTP, respectively. The annual NFTCs in the QTP region are mainly concentrated at 160 and above, and the average NFTCs is 200 across the QTP. Our results can provide scientific guidance and a theoretical basis for the freezing resistance design of concrete in various projects on the QTP.
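The model comparison in this abstract rests on two fit statistics, RMSE and R^2. A minimal sketch of both follows, assuming R^2 is the coefficient of determination (one minus the ratio of residual to total sum of squares); the abstract does not state which definition was used, and the function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical freeze-thaw cycle counts, not values from the paper.
observed_cycles = [150.0, 180.0, 210.0, 240.0]
fitted_cycles = [155.0, 176.0, 214.0, 235.0]
error = rmse(observed_cycles, fitted_cycles)
fit = r_squared(observed_cycles, fitted_cycles)
```

A lower RMSE together with an R^2 closer to 1 is the pattern the abstract reports for the GBM relative to the other three models.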
BACKGROUND: Synchronous liver metastasis (SLM) is a significant contributor to morbidity in colorectal cancer (CRC). There are no effective predictive device integration algorithms to predict adverse SLM events during the diagnosis of CRC. AIM: To explore the risk factors for SLM in CRC and construct a visual prediction model based on gray-level co-occurrence matrix (GLCM) features collected from magnetic resonance imaging (MRI). METHODS: Our study retrospectively enrolled 392 patients with CRC from Yichang Central People’s Hospital from January 2015 to May 2023. Patients were randomly divided into a training and validation group (3:7). The clinical parameters and GLCM features extracted from MRI were included as candidate variables. The prediction model was constructed using a generalized linear regression model, random forest model (RFM), and artificial neural network model. Receiver operating characteristic curves and decision curves were used to evaluate the prediction model. RESULTS: Among the 392 patients, 48 had SLM (12.24%). We obtained fourteen GLCM imaging features for variable screening of SLM prediction models. Inverse difference, mean sum, sum entropy, sum variance, sum of squares, energy, and difference variance were listed as candidate variables, and the prediction efficiency (area under the curve) of the subsequent RFM in the training set and internal validation set was 0.917 [95% confidence interval (95%CI): 0.866-0.968] and 0.909 (95%CI: 0.858-0.960), respectively. CONCLUSION: A predictive model combining GLCM image features with machine learning can predict SLM in CRC. This model can assist clinicians in making timely and personalized clinical decisions.
Fetal macrosomia is associated with maternal and newborn complications due to incorrect fetal weight estimation or inappropriate choice of delivery models. The early screening and evaluation of macrosomia in the third trimester can improve delivery outcomes and reduce complications. However, traditional clinical and ultrasound examinations face difficulties in obtaining accurate fetal measurements during the third trimester of pregnancy. This study aims to develop a comprehensive predictive model for detecting macrosomia using machine learning (ML) algorithms. The accuracy of macrosomia prediction using logistic regression, k-nearest neighbors, support vector machine, random forest (RF), XGBoost, and LightGBM algorithms was explored. Each approach was trained and validated using data from 3244 pregnant women at a hospital in southern China. The information gain method was employed to identify deterministic features associated with the occurrence of macrosomia. The performance of the six ML algorithms was compared based on the recall and area under the curve evaluation metrics. To develop an efficient prediction model, two sets of experiments based on ultrasound examination records within 1-7 days and 8-14 days prior to delivery were conducted. The ensemble model, comprising the RF, XGBoost, and LightGBM algorithms, showed encouraging results. For each experimental group, the proposed ensemble model outperformed other ML approaches and the traditional Hadlock formula. The experimental results indicate that, with the most risk-relevant features, the ML algorithms presented in this study can predict macrosomia and assist obstetricians in selecting more appropriate delivery models.
With the rapid growth of e-commerce and online transactions, e-commerce platforms face a critical challenge: predicting consumer behavior after purchase. This study aimed to forecast such after-sales behavior within the digital retail environment. We utilized four machine learning models (logistic regression, decision tree, random forest, and XGBoost), employing SMOTE oversampling and class weighting techniques to address class imbalance. To bolster the models’ predictive capabilities, we executed pivotal data processing steps, including feature derivation and one-hot encoding. Upon rigorous evaluation of the models’ performance through 5-fold cross-validation, the random forest model was identified as the superior performer, excelling in accuracy, F1 score, and AUC value, and was thus deemed the most effective model for anticipating consumer after-sales behavior. The findings from this research offer actionable strategies for e-commerce platforms to refine their after-sales services and enhance customer satisfaction.
The current study aimed at evaluating the capabilities of seven advanced machine learning techniques (MLTs), namely Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and at comparing their performances. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups with a ratio of 70% for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling: altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under the curve (AUC-ROC) approach was applied to evaluate, validate, and compare the MLTs’ performance. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that RF (AUC = 95.1%) and LDA (AUC = 94.17%) produced the best performances in comparison to the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
Due to combined influences such as ore-forming temperature and fluid and metal sources, sphalerite tends to incorporate diverse contents of trace elements during the formation of different types of lead-zinc (Pb-Zn) deposits. Therefore, trace elements in sphalerite have long been utilized to distinguish Pb-Zn deposit types. However, previous discriminant diagrams usually contain two or three dimensions, which limits their ability to reveal the complicated interrelations between trace elements of sphalerite and the types of Pb-Zn deposits. In this study, we aim to show that sphalerite trace elements can be used to classify Pb-Zn deposit types and to extract key factors from sphalerite trace elements that can discriminate Pb-Zn deposit types using machine learning algorithms. A dataset of nearly 3600 sphalerite spot analyses from 95 Pb-Zn deposits worldwide, determined by LA-ICP-MS, was compiled from peer-reviewed publications, containing 12 elements (Mn, Fe, Co, Cu, Ga, Ge, Ag, Cd, In, Sn, Sb, and Pb) from 5 types, including Sedimentary Exhalative (SEDEX), Mississippi Valley Type (MVT), Volcanic Massive Sulfide (VMS), skarn, and epithermal deposits. Random Forests (RF) is applied to the data processing, and the results show that trace elements of sphalerite can successfully discriminate different types of Pb-Zn deposits except for VMS deposits, most of which are misclassified as skarn and epithermal types. To further discriminate VMS deposits, future studies could focus on enlarging the number of VMS deposits in datasets and applying other geological factors along with sphalerite trace elements when constructing the classification model. RF’s feature importance and permutation feature importance were adopted to evaluate the element significance for classification. In addition, a visualization tool, t-distributed stochastic neighbor embedding (t-SNE), was used to verify the results of both classification and evaluation. The results presented here show that Mn, Co, and Ge display significant impacts on the classification of Pb-Zn deposits, and In, Ga, Sn, Cd, and Fe also have relatively important effects compared to the remaining elements, confirming that Pb-Zn deposit discrimination is mainly controlled by multiple elements in sphalerite. Our study hence shows that machine learning algorithms can provide new insights into conventional geochemical analyses, inspiring future research on constructing classification models of mineral deposits using mineral geochemistry data.
The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to rockbursts at underground coal mines is much greater. Factors such as the coal seam tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but the entire coal seam-surrounding rock system also has to be evaluated when trying to predict rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, the interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e., neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting (XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTG, which describes the tendency of the seam-surrounding rock system to rockbursts, and the anomaly of the vertical stress component were applied for this purpose. In particular, the decision tree and neural network models proved effective in correctly distinguishing rockbursts from tremors after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLA), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least square (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 previously existing gully erosion sites were mapped through widespread field investigations, and the observations were arbitrarily divided into 70% (323) for algorithm calibration and 30% (139) for validation. Twelve controlling factors for gully erosion, namely soil texture, annual mean rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLA were compared using a training dataset for gully erosion and statistical measures such as RMSE (root mean square error), MAE (mean absolute error), and R-squared. Based on the comparisons among MLA, the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion, whereas plan curvature had the least importance. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified as having a high (12.56%) or very high (4.07%) susceptibility. The outcome generated by the RF model is validated using the ROC (Receiver Operating Characteristics) curve approach, which returned an area under the curve (AUC) of 0.985, indicating the excellent forecasting ability of the model. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.
Due to the development of novel materials, the past two decades have witnessed rapid advances in soft electronics. Soft electronics have huge potential in physical sign monitoring and health care. One of the important advantages of soft electronics is that they form a good interface with skin, which can increase the user scale and improve the signal quality. Therefore, it is easy to build a specific dataset, which is important for improving the performance of machine learning algorithms. At the same time, with the assistance of machine learning algorithms, soft electronics have become more and more intelligent, realizing real-time analysis and diagnosis. Soft electronics and machine learning algorithms complement each other very well. It is indubitable that soft electronics will bring us to a healthier and more intelligent world in the near future. Therefore, in this review, we give a careful introduction to new soft materials, physiological signals detected by soft devices, and soft devices assisted by machine learning algorithms. Some soft materials are discussed, such as two-dimensional materials, carbon nanotubes, nanowires, nanomeshes, and hydrogels. Then, soft sensors are discussed according to the physiological signal types (pulse, respiration, human motion, intraocular pressure, phonation, etc.). After that, soft electronics assisted by various algorithms are reviewed, including some classical algorithms and powerful neural network algorithms. In particular, soft devices assisted by neural networks are introduced in detail. Finally, the outlook, challenges, and conclusions for soft systems powered by machine learning algorithms are discussed.
Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields, including stock market investment. However, few studies have focused on forecasting daily stock market returns, especially when using powerful machine learning techniques, such as deep neural networks (DNNs), to perform the analyses. DNNs employ various deep learning algorithms based on the combination of network structure, activation function, and model parameters, with their performance depending on the format of the data representation. This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) based on 60 financial and economic features. DNNs and traditional artificial neural networks (ANNs) are then deployed over the entire preprocessed but untransformed dataset, along with two datasets transformed via principal component analysis (PCA), to predict the daily direction of future stock market index returns. While controlling for overfitting, a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of hidden layers increases gradually from 12 to 1000. Moreover, a set of hypothesis testing procedures is implemented on the classification, and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset, as well as several other hybrid machine learning algorithms. In addition, the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested, including in a comparison against two standard benchmarks.
Model parameter estimation is a pivotal issue for runoff modeling in ungauged catchments. The nonlinear relationship between model parameters and catchment descriptors is a major obstacle for parameter regionalization, which is the most widely used approach. Runoff modeling was studied in 38 catchments located in the Yellow–Huai–Hai River Basin (YHHRB). The values of the Nash–Sutcliffe efficiency coefficient (NSE), coefficient of determination (R^2), and percent bias (PBIAS) indicated the acceptable performance of the soil and water assessment tool (SWAT) model in the YHHRB. Nine descriptors belonging to the categories of climate, soil, vegetation, and topography were used to express the catchment characteristics related to the hydrological processes. The quantitative relationships between the parameters of the SWAT model and the catchment descriptors were analyzed by six regression-based models, including linear regression (LR) equations, support vector regression (SVR), random forest (RF), k-nearest neighbor (kNN), decision tree (DT), and radial basis function (RBF). Each of the 38 catchments was assumed to be an ungauged catchment in turn. Then, the parameters in each target catchment were estimated by the constructed regression models based on the remaining 37 donor catchments. Furthermore, the similarity-based regionalization scheme was used for comparison with the regression-based approach. The results indicated that the runoff with the highest accuracy was modeled by the SVR-based scheme in ungauged catchments. Compared with the traditional LR-based approach, the accuracy of the runoff modeling in ungauged catchments was improved by the machine learning algorithms because of their outstanding capability to deal with nonlinear relationships. The performances of different approaches were similar in humid regions, while the advantages of the machine learning techniques were more evident in arid regions. When the study area contained nested catchments, the best result was calculated with the similarity-based parameter regionalization scheme because of the high catchment density and short spatial distance. The new findings could improve flood forecasting and water resources planning in regions that lack observed data.
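The leave-one-out scheme described above, in which each catchment is treated as ungauged in turn and its parameter is estimated from the remaining donor catchments, can be sketched as follows. This is a simplified illustration that assumes a single descriptor and ordinary least-squares linear regression standing in for the six regression models; the function names and the numbers are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_estimates(descriptor, parameter):
    """Estimate each catchment's parameter from all other (donor) catchments."""
    estimates = []
    for i in range(len(descriptor)):
        # Treat catchment i as ungauged: drop it from the training data.
        donor_x = descriptor[:i] + descriptor[i + 1:]
        donor_y = parameter[:i] + parameter[i + 1:]
        slope, intercept = fit_line(donor_x, donor_y)
        estimates.append(slope * descriptor[i] + intercept)
    return estimates


# Hypothetical descriptor (e.g. mean slope) and calibrated parameter values.
mean_slope = [1.0, 2.0, 3.0, 4.0, 5.0]
calibrated = [3.1, 4.9, 7.2, 8.8, 11.1]
regionalized = leave_one_out_estimates(mean_slope, calibrated)
```

Comparing the regionalized values with the calibrated ones, catchment by catchment, gives the kind of accuracy comparison the abstract reports across regression models.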
Some countries have announced national benchmark rates, while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021. Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate (TLREF), this study examines the determinants of TLREF. In this context, three global determinants, five country-level macroeconomic determinants, and the COVID-19 pandemic are considered, using daily data between December 28, 2018, and December 31, 2020, and applying machine learning algorithms and ordinary least squares. The empirical results show that (1) the most significant determinant is the amount of securities bought by central banks; (2) country-level macroeconomic factors have a higher impact, whereas global factors are less important and the pandemic does not have a significant effect; and (3) Random Forest is the most accurate prediction model. Taking action by considering the study’s findings can help support economic growth by achieving low-level benchmark rates.
This study aims to empirically analyze teaching-learning-based optimization (TLBO) and machine learning algorithms using k-means and fuzzy c-means (FCM) algorithms for their individual performance evaluation in terms of clustering and classification. In the first phase, the clustering (k-means and FCM) algorithms were employed independently and the clustering accuracy was evaluated using different computational measures. During the second phase, the non-clustered data obtained from the first phase were preprocessed with TLBO. TLBO was performed using k-means (TLBO-KM) and FCM (TLBO-FCM) (TLBO-KM/FCM) algorithms. The objective function was determined by considering both minimization and maximization criteria. Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization. Five benchmark datasets were considered from the University of California, Irvine (UCI) Machine Learning Repository for comparative study and experimentation. These are the Breast Cancer Wisconsin (BCW), Pima Indians Diabetes, Heart-Statlog, Hepatitis, and Cleveland Heart Disease datasets. The combined average accuracy obtained collectively is approximately 99.4% in the case of TLBO-KM and 98.6% in the case of TLBO-FCM. This approach is also capable of finding the dominating attributes. The findings indicate that TLBO-KM/FCM, considering different computational measures, perform well on the non-clustered data where k-means and FCM, if employed independently, fail to provide significant results. Evaluating different feature sets, the TLBO-KM/FCM and SVM(GS) clearly outperformed all other classifiers in terms of sensitivity, specificity and accuracy. TLBO-KM/FCM attained the highest average sensitivity (98.7%), highest average specificity (98.4%) and highest average accuracy (99.4%) for 10-fold cross validation with different test data.
Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens. To unravel the relationship between periodontitis and systemic diseases, it is very important to correctly discriminate major periodontal pathogens. To realize convenient, efficient, and high-accuracy bacterial species classification, the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens: Porphyromonas gingivalis (Pg), Fusobacterium nucleatum (Fn), and Aggregatibacter actinomycetemcomitans (Aa). The result shows that this novel method can successfully discriminate the three abovementioned periodontal pathogens. Moreover, the classification accuracies for the three categories of the original data were 94.7% at the sample level and 93.9% at the spectrum level using the extra trees machine learning algorithm. This study provides a fast, simple, and accurate method which is very beneficial for differentiating periodontal pathogens.
The finite element (FE)-based simulation of welding characteristics was carried out to explore the relationship among welding assembly properties for the parallel T-shaped thin-walled parts of an antenna structure. The effects of welding direction, clamping, fixture release time, fixed constraints, and welding sequences on these properties were analyzed, and the mapping relationship among welding characteristics was thoroughly examined. Different machine learning algorithms, including the generalized regression neural network (GRNN), wavelet neural network (WNN), and fuzzy neural network (FNN), are used to predict the multiple welding properties of thin-walled parts to mirror their variation trend and verify the correctness of the mapping relationship. Compared with those from GRNN and WNN, the maximum mean relative errors for the predicted values of deformation, temperature, and residual stress with FNN were less than 4.8%, 1.4%, and 4.4%, respectively. These results indicate that FNN generated the best predicted welding characteristics. Analysis under various welding conditions also shows a mapping relationship among welding deformation, temperature, and residual stress over a period of time. This finding further provides a paramount basis for the control of welding assembly errors of an antenna structure in the future.
Abstract: CARE (Cloud Archive Repository Express) has emerged from algorithmic machine learning, and acts like a “fastlane” to bridge between DATA and wiseCIO, where DATA stands for digital archiving & trans-analytics, and wiseCIO for web-based intelligent service. CARE incorporates DATA and wiseCIO into a triad for content management and delivery (CMD) to orchestrate Anything as a Service (XaaS) by using mathematical and computational solutions to cloud-based problems. This article presents algorithmic machine learning in CARE for “DNA-like” ingredients with trivial information eliminated through deep learning to support integral content management over DATA and informative delivery on wiseCIO. In particular, with algorithmic machine learning, CARE creatively incorporates express tokens for information interchange (eTokin) to promote seamless intercommunications among the CMD triad, which enables Anything as a Service and empowers ordinary users to be UNIQ professionals: ubiquitous manager of content management and delivery, novel designer of universal interface and user-centric experience, intelligent expert for business intelligence, and quinary liaison with XaaS, with no explicit coding required. Furthermore, the CMD triad harnesses rapid prototyping for user interface design and propels cohesive assembly from Anything orchestrated as a Service. More importantly, CARE collaboratively as a whole promotes instant publishing over DATA, efficient presentation to end-users via wiseCIO, and diligent intelligence for business, education, and entertainment (iBEE) through highly robotic process automation.
Funding: financially supported by the National Natural Science Foundation of China (No. 52073031); the National Key Research and Development Program of China (Nos. 2023YFB3208102, 2021YFB3200304); the China National Postdoctoral Program for Innovative Talents (No. BX2021302); the Beijing Nova Program (Nos. Z191100001119047, Z211100002121148); the Fundamental Research Funds for the Central Universities (No. E0EG6801X2); the ‘Hundred Talents Program’ of the Chinese Academy of Sciences; and the BrainLink program funded by the MSIT through the NRF of Korea (No. RS-2023-00237308).
Abstract: Neuromorphic computing extends beyond sequential processing modalities and outperforms traditional von Neumann architectures in implementing more complicated tasks, e.g., pattern processing, image recognition, and decision making. It features parallel interconnected neural networks, high fault tolerance, robustness, autonomous learning capability, and ultralow energy dissipation. The algorithms of artificial neural network (ANN) have also been widely used because of their facile self-organization and self-learning capabilities, which mimic those of the human brain. To some extent, ANN reflects several basic functions of the human brain and can be efficiently integrated into neuromorphic devices to perform neuromorphic computations. This review highlights recent advances in neuromorphic devices assisted by machine learning algorithms. First, the basic structure of simple neuron models inspired by biological neurons and the information processing in simple neural networks are discussed. Second, the fabrication and research progress of neuromorphic devices are presented with regard to materials and structures. Furthermore, the fabrication of neuromorphic devices, including stand-alone neuromorphic devices, neuromorphic device arrays, and integrated neuromorphic systems, is discussed and demonstrated with reference to some respective studies. The applications of neuromorphic devices assisted by machine learning algorithms in different fields are categorized and investigated. Finally, perspectives, suggestions, and potential solutions to the current challenges of neuromorphic devices are provided.
Abstract: Based on the Google Earth Engine cloud computing data platform, this study employed three algorithms, Support Vector Machine (SVM), Random Forest, and Classification and Regression Tree, to classify the current status of land cover in Hung Yen province of Vietnam using Landsat 8 OLI satellite images, a free data source with reasonable spatial and temporal resolution. The results of the study show that all three algorithms presented good classification for five basic types of land cover, including Rice land, Water bodies, Perennial vegetation, Annual vegetation, and Built-up areas, as their overall accuracy and Kappa coefficient were greater than 80% and 0.8, respectively. Among the three algorithms, SVM achieved the highest accuracy as its overall accuracy was 86% and the Kappa coefficient was 0.88. Land cover classification based on the SVM algorithm shows that Built-up areas cover the largest area with nearly 31,495 ha, accounting for more than 33.8% of the total natural area, followed by Rice land and Perennial vegetation, which cover an area of over 30,767 ha (33%) and 15,637 ha (16.8%), respectively. Water bodies and Annual vegetation cover the smallest areas with 8,820 ha (9.5%) and 6,302 ha (6.8%), respectively. The results of this study can be used for land use management and planning as well as other natural resource and environmental management purposes in the province.
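The two accuracy measures quoted in this abstract, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix. The following is a minimal sketch of the standard Cohen's kappa computation; the 2×2 matrix is made up for illustration and does not come from the study, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Illustrative two-class matrix (not data from the study).
confusion = [[40, 10],
             [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
```

Because chance agreement is subtracted from both numerator and denominator, kappa computed this way never exceeds the overall accuracy of the same matrix.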
Funding: the Chinese Clinical Trial Registry (No. ChiCTR2000040109), approved by the Hospital Ethics Committee (No. 20210130017).
Abstract: BACKGROUND: Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM: To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS: This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS: A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5–91.9 cm, OR = 1.895, 95% CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95% CI: 0.730-2.188), and anxiety (OR = 1.071, 95% CI: 1.044-1.100) as predictive factors for DCI, validated by LASSO and RF methods. Model performance revealed training/validation sensitivities of 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities of 0.602/0.511, 0.510/0.562, and 0.977/0.526; and corresponding areas under the receiver operating characteristic curve (AUCs) of 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820), respectively. DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and highest validation AUC (0.754), outperforming other methods in clinical applicability. CONCLUSION: The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
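For readers less familiar with the discrimination metrics reported in this abstract, sensitivity and specificity follow directly from the four cells of a binary confusion table. The sketch below uses invented counts purely for illustration; they are not the study's data, and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Invented counts for illustration only (not from the study):
# 50 difficult insertions, of which 45 were flagged;
# 150 non-difficult insertions, of which 90 were correctly cleared.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=90, fp=60)
```

A gap between training and validation values of these metrics, such as the one reported for the RF model's specificity, is usually read as a sign that validation figures are the more reliable guide.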
Abstract: This study examines the feasibility of using a machine learning approach for rapid damage assessment of reinforced concrete (RC) buildings after an earthquake. Since real-world damage datasets are lacking, have limited access, or are imbalanced, a simulation dataset is prepared by conducting a nonlinear time history analysis. Different machine learning (ML) models are trained considering the structural parameters and ground motion characteristics to predict the RC building damage in five categories: null, slight, moderate, heavy, and collapse. The random forest classifier (RFC) achieved a higher prediction accuracy on testing and real-world damage datasets. The structural parameters can be extracted using different means such as Google Earth, Open Street Map, unmanned aerial vehicles, etc. However, recording the ground motion at a closer distance requires the installation of a dense array of sensors, which comes at a higher cost. For places with no earthquake recording station or device, it is difficult to obtain ground motion characteristics. For that reason, different ML-based regressor models are developed utilizing past-earthquake information to predict ground motion parameters such as peak ground acceleration and peak ground velocity. The random forest regressor (RFR) achieved better results than other regression models on testing and validation datasets. Furthermore, compared with the results of similar research works, a better result is obtained using RFC and RFR on validation datasets. In the end, these models are utilized to predict the damage categories of RC buildings at Saitama University and Okubo Danchi, Saitama, Japan after an earthquake. This damage information is crucial for government agencies or decision-makers to respond systematically in post-disaster situations.
Funding: supported by the Shandong Provincial Natural Science Foundation (grant number: ZR2023MD036); the Key Research and Development Project in Shandong Province (grant number: 2019GGX101064); and the project for excellent youth foundation of the innovation teacher team, Shandong (grant number: 2022KJ310).
Abstract: The reasonable quantification of the concrete freezing environment on the Qinghai-Tibet Plateau (QTP) is the primary issue in frost-resistant concrete design, and one of the challenges that QTP engineering managers should take into account. In this paper, we propose a more realistic method to calculate the number of concrete freeze-thaw cycles (NFTCs) on the QTP. The calculated results show that the NFTCs increase as the altitude of the meteorological station increases, with the average NFTCs being 208.7. Four machine learning methods, i.e., the random forest (RF) model, generalized boosting method (GBM), generalized linear model (GLM), and generalized additive model (GAM), are used to fit the NFTCs. The root mean square error (RMSE) values of the RF, GBM, GLM, and GAM are 32.3, 4.3, 247.9, and 161.3, respectively. The R² values of the RF, GBM, GLM, and GAM are 0.93, 0.99, 0.48, and 0.66, respectively. The GBM method performs the best of the four methods, as shown by the RMSE and R² values. The quantitative results from the GBM method indicate that the lowest, medium, and highest NFTC values are distributed in the northern, central, and southern parts of the QTP, respectively. The annual NFTCs in the QTP region are mainly concentrated at 160 and above, and the average NFTCs is 200 across the QTP. Our results can provide scientific guidance and a theoretical basis for the freezing resistance design of concrete in various projects on the QTP.
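The two fit statistics used to rank the four models in this abstract, RMSE and R², can be computed as sketched below. This assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not say whether that definition or the squared correlation was used. The sample values are invented and the function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Invented values for illustration only (not NFTC data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

RMSE is expressed in the units of the target (here, number of cycles), while R² is unitless, which is why the abstract reports both.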
Abstract: BACKGROUND: Synchronous liver metastasis (SLM) is a significant contributor to morbidity in colorectal cancer (CRC). There are no effective predictive device integration algorithms to predict adverse SLM events during the diagnosis of CRC. AIM: To explore the risk factors for SLM in CRC and construct a visual prediction model based on gray-level co-occurrence matrix (GLCM) features collected from magnetic resonance imaging (MRI). METHODS: Our study retrospectively enrolled 392 patients with CRC from Yichang Central People’s Hospital from January 2015 to May 2023. Patients were randomly divided into a training and validation group (3:7). The clinical parameters and GLCM features extracted from MRI were included as candidate variables. The prediction model was constructed using a generalized linear regression model, random forest model (RFM), and artificial neural network model. Receiver operating characteristic curves and decision curves were used to evaluate the prediction model. RESULTS: Among the 392 patients, 48 had SLM (12.24%). We obtained fourteen GLCM imaging features for variable screening of SLM prediction models. Inverse difference, mean sum, sum entropy, sum variance, sum of squares, energy, and difference variance were listed as candidate variables, and the prediction efficiency (area under the curve) of the subsequent RFM in the training set and internal validation set was 0.917 [95% confidence interval (95% CI): 0.866-0.968] and 0.909 (95% CI: 0.858-0.960), respectively. CONCLUSION: A predictive model combining GLCM image features with machine learning can predict SLM in CRC. This model can assist clinicians in making timely and personalized clinical decisions.
Funding: supported by the High Level-Hospital Program, Health Commission of Guangdong Province, China, No. HKUSZH201901011; and the Shenzhen Science and Technology Program, No. JCYJ20220530142017038.
Abstract: Fetal macrosomia is associated with maternal and newborn complications due to incorrect fetal weight estimation or inappropriate choice of delivery models. The early screening and evaluation of macrosomia in the third trimester can improve delivery outcomes and reduce complications. However, traditional clinical and ultrasound examinations face difficulties in obtaining accurate fetal measurements during the third trimester of pregnancy. This study aims to develop a comprehensive predictive model for detecting macrosomia using machine learning (ML) algorithms. The accuracy of macrosomia prediction using logistic regression, k-nearest neighbors, support vector machine, random forest (RF), XGBoost, and LightGBM algorithms was explored. Each approach was trained and validated using data from 3244 pregnant women at a hospital in southern China. The information gain method was employed to identify deterministic features associated with the occurrence of macrosomia. The performance of the six ML algorithms was compared based on the recall and area under the curve evaluation metrics. To develop an efficient prediction model, two sets of experiments based on ultrasound examination records within 1-7 days and 8-14 days prior to delivery were conducted. The ensemble model, comprising the RF, XGBoost, and LightGBM algorithms, showed encouraging results. For each experimental group, the proposed ensemble model outperformed other ML approaches and the traditional Hadlock formula. The experimental results indicate that, with the most risk-relevant features, the ML algorithms presented in this study can predict macrosomia and assist obstetricians in selecting more appropriate delivery models.
Abstract: With the rapid growth of e-commerce and online transactions, e-commerce platforms face a critical challenge: predicting consumer behavior after purchase. This study aimed to forecast such after-sales behavior within the digital retail environment. We utilized four machine learning models (logistic regression, decision tree, random forest, and XGBoost), employing SMOTE oversampling and class weighting techniques to address class imbalance. To bolster the models’ predictive capabilities, we executed pivotal data processing steps, including feature derivation and one-hot encoding. Upon rigorous evaluation of the models’ performance through 5-fold cross-validation, the random forest model was identified as the superior performer, excelling in accuracy, F1 score, and AUC value, and was thus deemed the most effective model for anticipating consumer after-sales behavior. The findings from this research offer actionable strategies for e-commerce platforms to refine their after-sales services and enhance customer satisfaction.
Abstract: The current study aimed at evaluating the capabilities of seven advanced machine learning techniques (MLTs), including Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and at comparing their performances. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. First, a total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups with a ratio of 70% for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling, which include altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and slope aspect. The area under curve (AUC-ROC) approach has been applied to evaluate, validate, and compare the MLTs' performance. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. Our findings showed that the RF (AUC=95.1%) and LDA (AUC=941.7%) have produced the best performances in comparison to other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
Funding: We would like to acknowledge the financial support of the Ministry of Science and Technology of China (Grant No. 2021YFC2900300) and the National Natural Science Foundation of China (Grant Nos. 41772074 and 42172103).
Abstract: Due to combined influences such as ore-forming temperature, fluid, and metal sources, sphalerite tends to incorporate diverse contents of trace elements during the formation of different types of lead-zinc (Pb-Zn) deposits. Therefore, trace elements in sphalerite have long been utilized to distinguish Pb-Zn deposit types. However, previous discriminant diagrams usually contain two or three dimensions, which limits their ability to reveal the complicated interrelations between trace elements of sphalerite and the types of Pb-Zn deposits. In this study, we aim to prove that the sphalerite trace elements can be used to classify the Pb-Zn deposit types and to extract key factors from sphalerite trace elements that can discriminate Pb-Zn deposit types using machine learning algorithms. A dataset of nearly 3600 sphalerite spot analyses from 95 Pb-Zn deposits worldwide determined by LA-ICP-MS was compiled from peer-reviewed publications, containing 12 elements (Mn, Fe, Co, Cu, Ga, Ge, Ag, Cd, In, Sn, Sb, and Pb) from 5 types, including Sedimentary Exhalative (SEDEX), Mississippi Valley Type (MVT), Volcanic Massive Sulfide (VMS), skarn, and epithermal deposits. Random Forests (RF) is applied to the data processing and the results show that trace elements of sphalerite can successfully discriminate different types of Pb-Zn deposits except for VMS deposits, most of which are falsely classified as skarn and epithermal types. To further discriminate VMS deposits, future studies could focus on enlarging the share of VMS deposits in datasets and applying other geological factors along with sphalerite trace elements when constructing the classification model. RF’s feature importance and permutation feature importance were adopted to evaluate the element significance for classification. Besides, a visualization tool, t-distributed stochastic neighbor embedding (t-SNE), was used to verify the results of both classification and evaluation. The results presented here show that Mn, Co, and Ge display significant impacts on classification of Pb-Zn deposits, and In, Ga, Sn, Cd, and Fe also have relatively important effects compared to the rest of the elements, confirming that Pb-Zn deposit discrimination is mainly controlled by multiple elements in sphalerite. Our study hence shows that machine learning algorithms can provide new insights into conventional geochemical analyses, inspiring future research on constructing classification models of mineral deposits using mineral geochemistry data.
Funding: supported by the Ministry of Science and Higher Education, Republic of Poland (Statutory Activity of the Central Mining Institute, Grant No. 11133010).
Abstract: The risk of rockbursts is one of the main threats in hard coal mines. Compared to other underground mines, the number of factors contributing to rockbursts at underground coal mines is much greater. Factors such as the coal seam tendency to rockbursts, the thickness of the coal seam, and the stress level in the seam have to be considered, but the entire coal seam-surrounding rock system also has to be evaluated when trying to predict rockbursts. However, in hard coal mines, there are stroke or stress-stroke rockbursts in which the fracture of a thick layer of sandstone plays an essential role in predicting rockbursts. The occurrence of rockbursts in coal mines is complex, and their prediction is even more difficult than in other mines. In recent years, the interest in machine learning algorithms for solving complex nonlinear problems has increased, which also applies to geosciences. This study attempts to use machine learning algorithms, i.e., neural network, decision tree, random forest, gradient boosting, and extreme gradient boosting (XGB), to assess the rockburst hazard of an active hard coal mine in the Upper Silesian Coal Basin. The rock mass bursting tendency index WTG, which describes the tendency of the seam-surrounding rock system to rockbursts, and the anomaly of the vertical stress component were applied for this purpose. In particular, the decision tree and neural network models proved to be effective in correctly distinguishing rockbursts from tremors, after which the excavation was not damaged. On average, these models correctly classified about 80% of the rockbursts in the testing datasets.
Funding: supported by the College of Agriculture, Shiraz University (Grant No. 97GRC1M271143); funding from the UK Biotechnology and Biological Sciences Research Council (BBSRC); funded by BBSRC grant award BBS/E/C/000I0330 – Soil to Nutrition project 3 – Sustainable intensification: optimisation at multiple scales.
Abstract: This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLA), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least square (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 previously existing gully erosion sites were mapped through widespread field investigations, of which 70% (323) and 30% (139) of observations were arbitrarily divided for algorithm calibration and validation. Twelve controlling factors for gully erosion, namely, soil texture, annual mean rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLA were compared using a training dataset for gully erosion and statistical measures such as RMSE (root mean square error), MAE (mean absolute error), and R-squared. Based on the comparisons among MLA, the RF algorithm exhibited the minimum RMSE and MAE and the maximum value of R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the highest significance in influencing the occurrence of gully erosion, whereas plan curvature had the least importance. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified to have a high (12.56%) or very high (4.07%) susceptibility. The outcome generated by the RF model is validated using the ROC (Receiver Operating Characteristics) curve approach, which returned an area under the curve (AUC) of 0.985, indicating the excellent forecasting ability of the model. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions for minimizing the damage caused by gully erosion.
Funding: supported by the National Natural Science Foundation of China (No. 62201624, 32000939, 21775168, 22174167, 51861145202, U20A20168); the Guangdong Basic and Applied Basic Research Foundation (2019A1515111183); the Shenzhen Research Funding Program (JCYJ20190807160401657, JCYJ201908073000608, JCYJ20150831192224146); the National Key R&D Program (2018YFC2001202); the Research Fund from the Tsinghua University Initiative Scientific Research Program; and the Key Laboratory of Sensing Technology and Biomedical Instruments of Guangdong Province (No. 2020B1212060077).
Abstract: Due to the development of novel materials, the past two decades have witnessed rapid advances in soft electronics. Soft electronics have huge potential in physical sign monitoring and health care. One of the important advantages of soft electronics is that they form a good interface with skin, which can increase the user scale and improve the signal quality. Therefore, it is easy to build a specific dataset, which is important for improving the performance of machine learning algorithms. At the same time, with the assistance of machine learning algorithms, soft electronics have become more and more intelligent, realizing real-time analysis and diagnosis. Soft electronics and machine learning algorithms complement each other very well. It is indubitable that soft electronics will bring us to a healthier and more intelligent world in the near future. Therefore, in this review, we give a careful introduction to new soft materials, physiological signals detected by soft devices, and soft devices assisted by machine learning algorithms. Some soft materials are discussed, such as two-dimensional materials, carbon nanotubes, nanowires, nanomeshes, and hydrogels. Then, soft sensors are discussed according to the physiological signal types (pulse, respiration, human motion, intraocular pressure, phonation, etc.). After that, soft electronics assisted by various algorithms are reviewed, including some classical algorithms and powerful neural network algorithms. In particular, soft devices assisted by neural networks are introduced carefully. Finally, the outlook, challenges, and conclusions for soft systems powered by machine learning algorithms are discussed.
Abstract: Big data analytic techniques associated with machine learning algorithms are playing an increasingly important role in various application fields, including stock market investment. However, few studies have focused on forecasting daily stock market returns, especially when using powerful machine learning techniques, such as deep neural networks (DNNs), to perform the analyses. DNNs employ various deep learning algorithms based on the combination of network structure, activation function, and model parameters, with their performance depending on the format of the data representation. This paper presents a comprehensive big data analytics process to predict the daily return direction of the SPDR S&P 500 ETF (ticker symbol: SPY) based on 60 financial and economic features. DNNs and traditional artificial neural networks (ANNs) are then deployed over the entire preprocessed but untransformed dataset, along with two datasets transformed via principal component analysis (PCA), to predict the daily direction of future stock market index returns. While controlling for overfitting, a pattern for the classification accuracy of the DNNs is detected and demonstrated as the number of the hidden layers increases gradually from 12 to 1000. Moreover, a set of hypothesis testing procedures is implemented on the classification, and the simulation results show that the DNNs using two PCA-represented datasets give significantly higher classification accuracy than those using the entire untransformed dataset, as well as several other hybrid machine learning algorithms. In addition, the trading strategies guided by the DNN classification process based on PCA-represented data perform slightly better than the others tested, including in a comparison against two standard benchmarks.
Funding: Funded by the National Key Research and Development Program of China (2017YFA0605002, 2017YFA0605004, and 2016YFA0601501), the National Natural Science Foundation of China (41961124007, 51779145, and 41830863), and “Six top talents” in Jiangsu Province (RJFW-031).
Abstract: Model parameter estimation is a pivotal issue for runoff modeling in ungauged catchments. The nonlinear relationship between model parameters and catchment descriptors is a major obstacle for parameter regionalization, which is the most widely used approach. Runoff modeling was studied in 38 catchments located in the Yellow–Huai–Hai River Basin (YHHRB). The values of the Nash–Sutcliffe efficiency coefficient (NSE), coefficient of determination (R²), and percent bias (PBIAS) indicated acceptable performance of the soil and water assessment tool (SWAT) model in the YHHRB. Nine descriptors belonging to the categories of climate, soil, vegetation, and topography were used to express the catchment characteristics related to the hydrological processes. The quantitative relationships between the parameters of the SWAT model and the catchment descriptors were analyzed by six regression-based models: linear regression (LR) equations, support vector regression (SVR), random forest (RF), k-nearest neighbor (kNN), decision tree (DT), and radial basis function (RBF). Each of the 38 catchments was assumed to be an ungauged catchment in turn. The parameters in each target catchment were then estimated by the regression models constructed from the remaining 37 donor catchments. Furthermore, the similarity-based regionalization scheme was used for comparison with the regression-based approach. The results indicated that, in ungauged catchments, the SVR-based scheme modeled runoff with the highest accuracy. Compared with the traditional LR-based approach, the machine learning algorithms improved the accuracy of runoff modeling in ungauged catchments because of their outstanding capability to deal with nonlinear relationships. The performances of the different approaches were similar in humid regions, while the advantages of the machine learning techniques were more evident in arid regions. When the study area contained nested catchments, the best result was obtained with the similarity-based parameter regionalization scheme because of the high catchment density and short spatial distance. The new findings could improve flood forecasting and water resources planning in regions that lack observed data.
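The leave-one-out procedure described in the abstract above (each catchment treated as ungauged in turn, with its parameter estimated from the remaining donors) can be sketched in a few lines. This is only an illustrative sketch, not the study's implementation: it uses a plain k-nearest-neighbor average as a stand-in for the six regressors, and the function names, the toy descriptors, and the toy parameter values are all made up for the example.

```python
import math

def knn_predict(donor_x, donor_y, query, k=3):
    """Average the parameter values of the k donors closest to the query
    (Euclidean distance in descriptor space)."""
    ranked = sorted(zip(donor_x, donor_y),
                    key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def leave_one_out(descriptors, parameters, k=3):
    """Treat each catchment as ungauged in turn and estimate its parameter
    from all the other (donor) catchments."""
    estimates = []
    for i, target in enumerate(descriptors):
        donor_x = descriptors[:i] + descriptors[i + 1:]
        donor_y = parameters[:i] + parameters[i + 1:]
        estimates.append(knn_predict(donor_x, donor_y, target, k))
    return estimates

# Hypothetical toy data: 5 catchments, 2 descriptors each, 1 model parameter.
descriptors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0), (11.0, 10.0)]
parameters = [1.0, 2.0, 3.0, 10.0, 12.0]

estimates = leave_one_out(descriptors, parameters, k=2)
print(estimates)
```

With only one similar donor available, the two isolated catchments in the toy data receive poor estimates, which illustrates why donor density matters for this kind of scheme.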
Abstract: Some countries have announced national benchmark rates, while others have been working on the recent trend in which the London Interbank Offered Rate will be retired at the end of 2021. Considering that Turkey announced the Turkish Lira Overnight Reference Interest Rate (TLREF), this study examines the determinants of TLREF. In this context, three global determinants, five country-level macroeconomic determinants, and the COVID-19 pandemic are considered, using daily data between December 28, 2018, and December 31, 2020, and applying machine learning algorithms and ordinary least squares. The empirical results show that (1) the most significant determinant is the amount of securities bought by central banks; (2) country-level macroeconomic factors have a higher impact, whereas global factors are less important and the pandemic does not have a significant effect; and (3) Random Forest is the most accurate prediction model. Taking action in light of the study's findings can help support economic growth by achieving low-level benchmark rates.
Abstract: This study aims to empirically analyze teaching-learning-based optimization (TLBO) and machine learning algorithms using the k-means and fuzzy c-means (FCM) algorithms for their individual performance evaluation in terms of clustering and classification. In the first phase, the clustering algorithms (k-means and FCM) were employed independently, and the clustering accuracy was evaluated using different computational measures. During the second phase, the non-clustered data obtained from the first phase were preprocessed with TLBO. TLBO was performed using the k-means (TLBO-KM) and FCM (TLBO-FCM) algorithms (together, TLBO-KM/FCM). The objective function was determined by considering both minimization and maximization criteria. Non-clustered data obtained from the first phase were further utilized and fed as input for threshold optimization. Five benchmark datasets from the University of California, Irvine (UCI) Machine Learning Repository were considered for comparative study and experimentation: the breast cancer Wisconsin (BCW), Pima Indians Diabetes, Heart-Statlog, Hepatitis, and Cleveland Heart Disease datasets. The combined average accuracy obtained collectively is approximately 99.4% for TLBO-KM and 98.6% for TLBO-FCM. This approach is also capable of finding the dominating attributes. The findings indicate that TLBO-KM/FCM, considering different computational measures, performs well on the non-clustered data, where k-means and FCM, if employed independently, fail to provide significant results. Evaluating different feature sets, TLBO-KM/FCM and SVM (GS) clearly outperformed all other classifiers in terms of sensitivity, specificity, and accuracy. TLBO-KM/FCM attained the highest average sensitivity (98.7%), highest average specificity (98.4%), and highest average accuracy (99.4%) for 10-fold cross validation with different test data.
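The sensitivity, specificity, and accuracy figures quoted in the abstract above follow the standard confusion-matrix definitions. The sketch below shows those definitions for a binary problem; the function name and the toy labels are hypothetical and are not taken from the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical toy labels: 4 positive cases, 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a 10-fold cross-validation setting, these three values would be computed on each held-out fold and then averaged.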
Funding: Funded by the Major Program of Social Science Foundation of Tianjin Municipal Education Commission (2019JWZD53).
Abstract: Periodontitis is closely related to many systemic diseases linked by different periodontal pathogens. To unravel the relationship between periodontitis and systemic diseases, it is very important to correctly discriminate the major periodontal pathogens. To realize convenient, efficient, and high-accuracy bacterial species classification, the authors use Raman spectroscopy combined with machine learning algorithms to distinguish three major periodontal pathogens: Porphyromonas gingivalis (Pg), Fusobacterium nucleatum (Fn), and Aggregatibacter actinomycetemcomitans (Aa). The results show that this novel method can successfully discriminate the three abovementioned periodontal pathogens. Moreover, with the extra trees machine learning algorithm, the classification accuracies for the three categories of the original data were 94.7% at the sample level and 93.9% at the spectrum level. This study provides a fast, simple, and accurate method that is very beneficial for differentiating periodontal pathogens.
Funding: The Natural Science Foundation of Jiangsu Province, China (No. BK20200470); China Postdoctoral Science Foundation (No. 2021M691595); Innovation and Entrepreneurship Plan Talent Program of Jiangsu Province (No. AD99002).
Abstract: A finite element (FE)-based simulation of welding characteristics was carried out to explore the relationship among welding assembly properties for the parallel T-shaped thin-walled parts of an antenna structure. The effects of welding direction, clamping, fixture release time, fixed constraints, and welding sequences on these properties were analyzed, and the mapping relationship among welding characteristics was thoroughly examined. Different machine learning algorithms, including the generalized regression neural network (GRNN), wavelet neural network (WNN), and fuzzy neural network (FNN), were used to predict the multiple welding properties of thin-walled parts to mirror their variation trend and verify the correctness of the mapping relationship. Compared with those from GRNN and WNN, the maximum mean relative errors for the predicted values of deformation, temperature, and residual stress with FNN were less than 4.8%, 1.4%, and 4.4%, respectively. These results indicate that FNN generated the best predicted welding characteristics. Analysis under various welding conditions also shows a mapping relationship among welding deformation, temperature, and residual stress over a period of time. This finding provides an important basis for the future control of welding assembly errors of an antenna structure.
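The abstract above compares predictors by mean relative error. As a rough sketch of that metric under its usual definition (the mean of |predicted − reference| / |reference|), the snippet below computes it as a percentage; the function name and the numbers are invented for illustration and are not results from the study.

```python
def mean_relative_error(predicted, reference):
    """Mean of |predicted - reference| / |reference|, as a fraction.
    Assumes no reference value is zero."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)

# Hypothetical values, e.g. simulated vs. predicted peak temperatures.
reference = [100.0, 200.0, 400.0]
predicted = [102.0, 196.0, 400.0]

mre_percent = 100.0 * mean_relative_error(predicted, reference)
print(f"mean relative error: {mre_percent:.2f}%")
```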