Landslides pose a formidable natural hazard across the Qinghai-Tibet Plateau (QTP), endangering both ecosystems and human life. Identifying the driving factors behind landslides and accurately assessing susceptibility are key to mitigating disaster risk. This study integrated multi-source historical landslide data with 15 predictive factors and used several machine learning models, namely Random Forest (RF), Gradient Boosting Regression Trees (GBRT), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost), to generate susceptibility maps. The Shapley additive explanation (SHAP) method was applied to quantify factor importance and explore their nonlinear effects. The results showed that: (1) CatBoost was the best-performing model (CA=0.938, AUC=0.980) in assessing landslide susceptibility, with altitude emerging as the most significant factor, followed by distance to roads and earthquake sites, precipitation, and slope; (2) the SHAP method revealed critical nonlinear thresholds, demonstrating that historical landslides were concentrated at mid-altitudes (1400-4000 m) and decreased markedly above 4000 m, with a parallel reduction in probability beyond 700 m from roads; and (3) landslide-prone areas, comprising 13% of the QTP, were concentrated in the southeastern and northeastern parts of the plateau. By integrating machine learning and SHAP analysis, this study revealed landslide hazard-prone areas and their driving factors, providing insights to support disaster management strategies and sustainable regional planning.
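As an illustration of the CatBoost-plus-SHAP workflow this abstract describes, the sketch below trains a susceptibility classifier and produces a SHAP summary; it is a minimal Python example, and the file name, column names, and hyperparameters are assumptions rather than the study's configuration.

```python
import pandas as pd
import shap
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical file with a binary `label` column and 15 predictive-factor columns.
landslides = pd.read_csv("qtp_landslide_samples.csv")
X, y = landslides.drop(columns=["label"]), landslides["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = CatBoostClassifier(iterations=500, depth=6, learning_rate=0.1, verbose=0)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# SHAP values quantify each factor's contribution per sample; the summary plot is
# where nonlinear thresholds (altitude bands, distance to roads) become visible.
explainer = shap.TreeExplainer(model)
shap.summary_plot(explainer.shap_values(X_test), X_test)
```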
Software-Defined Network (SDN) decouples the control plane of network devices from the data plane. While alleviating the problems present in traditional network architectures, it also brings potential security risks, particularly network Denial-of-Service (DoS) attacks. While many research efforts have been devoted to identifying new features for DoS attack detection, detection methods are less accurate in detecting DoS attacks against client hosts because of the high stealth of such attacks. To solve this problem, a new DoS attack detection method based on the Deep Factorization Machine (DeepFM) is proposed for SDN. First, we select the Growth Rate of Max Matched Packets (GRMMP) in SDN as the detection feature. Then, the DeepFM algorithm is used to extract features from flow rules and classify them into dense and discrete features to detect DoS attacks. After training, the model can be used to infer whether the SDN is under DoS attack, and a DeepFM-based detection method for DoS attacks against client hosts is implemented. Simulation results show that our method can effectively detect DoS attacks in SDN. Compared with K-Nearest Neighbor (K-NN), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Random Forest models, the proposed method performs better in accuracy, precision, and F1 score.
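A minimal PyTorch sketch of a DeepFM-style binary classifier of the kind the abstract applies to flow-rule features; the field cardinalities, embedding size, and network depth are illustrative assumptions, not the authors' settings, and the GRMMP feature pipeline is omitted.

```python
import torch
import torch.nn as nn

class DeepFM(nn.Module):
    def __init__(self, field_dims, embed_dim=8, hidden=(64, 32)):
        super().__init__()
        self.embedding = nn.ModuleList([nn.Embedding(d, embed_dim) for d in field_dims])
        self.linear = nn.ModuleList([nn.Embedding(d, 1) for d in field_dims])
        self.bias = nn.Parameter(torch.zeros(1))
        layers, in_dim = [], len(field_dims) * embed_dim
        for h in hidden:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        self.mlp = nn.Sequential(*layers, nn.Linear(in_dim, 1))

    def forward(self, x):  # x: (batch, num_fields) integer category indices
        emb = torch.stack([e(x[:, i]) for i, e in enumerate(self.embedding)], dim=1)
        first_order = sum(l(x[:, i]) for i, l in enumerate(self.linear)) + self.bias
        square_of_sum = emb.sum(dim=1).pow(2)     # FM trick: pairwise interactions
        sum_of_square = emb.pow(2).sum(dim=1)     # in O(k*n) rather than O(k*n^2)
        second_order = 0.5 * (square_of_sum - sum_of_square).sum(dim=1, keepdim=True)
        deep = self.mlp(emb.flatten(start_dim=1))  # shared embeddings feed the MLP
        return torch.sigmoid(first_order + second_order + deep)

model = DeepFM(field_dims=[100, 50, 20])            # hypothetical field cardinalities
probs = model(torch.randint(0, 20, (4, 3)))         # probability each flow is a DoS attack
```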
During live broadcasts, eye movement characteristics can reflect people's attention to the product. However, existing interest-degree prediction models do not consider eye movement characteristics. To capture users' interest in a product more effectively, we consider key eye movement indicators. We first collect eye movement characteristics using the self-developed data processing algorithm fast discriminative model prediction for tracking (FDIMP), and then add data dimensions to the original dataset through information filling. In addition, we apply the deep factorization machine (DeepFM) architecture to simultaneously learn combinations of low-level and high-level features. To learn important features effectively and emphasize the relatively important ones, a multi-head attention mechanism is applied in the interest model. Experimental results on the public dataset Criteo show that, compared with the original DeepFM algorithm, the area under the curve (AUC) value improved by up to 9.32%.
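The sketch below shows how multi-head self-attention can re-weight per-field embeddings before the DeepFM prediction head, the mechanism this abstract refers to; the shapes and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

batch, num_fields, embed_dim = 4, 10, 16
field_embeddings = torch.randn(batch, num_fields, embed_dim)   # output of the embedding layer

attn = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=4, batch_first=True)
attended, weights = attn(field_embeddings, field_embeddings, field_embeddings)
print(attended.shape)   # (4, 10, 16): attention-refined features fed to the deep component
print(weights.shape)    # (4, 10, 10): head-averaged importance of each field pair
```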
Performing arts and movies have become commercial products with high profit and great market potential. Previous studies have developed comprehensive models to forecast the demand for movies. However, they did not pay enough attention to decision support for the performing arts, which is a special category unlike movies. For performing arts with high-dimensional categorical attributes and limited samples, determining ticket prices at different levels is still a challenging job faced by producers and distributors. Given these difficulties, the factorization machine (FM), which can handle huge sparse categorical attributes, is used in this work first. Adaptive stochastic gradient descent (ASGD) and Markov chain Monte Carlo (MCMC) are both explored to estimate the model parameters of the FM. FM with ASGD (FM-ASGD) and FM with MCMC (FM-MCMC) both achieve better prediction accuracy than a traditional algorithm. In addition, a multi-output model is proposed to determine prices at multiple price levels simultaneously, which avoids repeatedly training separate models. The results also confirm the prediction accuracy of the multi-output model compared with that of the general single-output model.
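For readers unfamiliar with factorization machines, the following sketch shows the second-order FM prediction in its O(kn) form together with one plain SGD update; the feature layout, learning rate, and price value are hypothetical, and the paper's ASGD and MCMC estimators replace the simple update shown here.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """y(x) = w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j, computed in O(k*n)."""
    linear = w0 + x @ w
    interaction = 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))
    return linear + interaction

rng = np.random.default_rng(0)
n_features, k = 200, 8                              # sparse one-hot attributes, latent dimension
w0, w, V = 0.0, np.zeros(n_features), rng.normal(0, 0.01, (n_features, k))

# One SGD step on a squared-error loss for a single (x, y) pair.
x = np.zeros(n_features); x[[3, 57, 120]] = 1.0     # hypothetical categorical encoding
y, lr = 120.0, 0.01                                 # hypothetical ticket price, step size
err = fm_predict(x, w0, w, V) - y
w0 -= lr * err
w  -= lr * err * x
V  -= lr * err * (np.outer(x, x @ V) - (x ** 2)[:, None] * V)   # dy/dV for the FM term
```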
Modern manufacturing systems are expected to undertake multiple tasks and be flexible enough for extensive customization, and these trends make production systems more and more complicated. The advantage of a complex production system is the capability to fulfill more intensive goods production and to adapt to various parameters under different conditions. The disadvantage, on the other hand, is that control difficulties rise dramatically as complexity increases. Moreover, classical methods struggle to control a complex system, and searching for an appropriate control policy becomes more complicated. Thanks to the development of machine learning technology, this problem now has more possible solutions. In this paper, a hybrid machine learning algorithm that integrates a genetic algorithm and a reinforcement learning algorithm is proposed to address the accuracy of the control policy and the system optimization issue in the simulation of a complex manufacturing system. The objective of this paper is to cut down the makespan and the due date in the manufacturing system. Three use cases, based on different product recipes, are employed to validate the algorithm, and the results prove the applicability of the hybrid algorithm. In addition, some further results are helpful for finding a solution to complex system optimization and manufacturing system structure transformation.
The machine loading problem in a flexible manufacturing system is addressed in this paper. The problem is modelled as a mixed integer program. A Genetic Algorithm (GA) approach is developed to yield an optimal solution. In the genetic algorithm, chromosomes are encoded in terms of operation routes. A point-to-point crossover search operator together with a Cyclic Shifting Mutation (CSM) operator is designed to adapt to the problem. Finally, computational experience with the model is presented, and the results show that our genetic algorithms are very powerful and well suited to machine loading problems.
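A small Python sketch of the two operators the abstract names, a point-to-point crossover and a cyclic shifting mutation, on route-encoded chromosomes; the encoding and operator details shown are illustrative guesses, not the paper's exact definitions.

```python
import random

def cyclic_shift_mutation(chromosome):
    """Rotate a randomly chosen segment of the operation-route chromosome by one position."""
    i, j = sorted(random.sample(range(len(chromosome)), 2))
    segment = chromosome[i:j + 1]
    return chromosome[:i] + segment[-1:] + segment[:-1] + chromosome[j + 1:]

def point_to_point_crossover(parent_a, parent_b):
    """Swap the genes between two random points, keeping positions outside the window."""
    i, j = sorted(random.sample(range(len(parent_a)), 2))
    child_a = parent_a[:i] + parent_b[i:j + 1] + parent_a[j + 1:]
    child_b = parent_b[:i] + parent_a[i:j + 1] + parent_b[j + 1:]
    return child_a, child_b

routes = [3, 1, 4, 0, 2, 5]                  # hypothetical operation-route encoding
print(cyclic_shift_mutation(routes))
print(point_to_point_crossover(routes, [5, 4, 3, 2, 1, 0]))
```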
There is no sound scientific basis for selecting the excellent teachers of a school's courses. To solve this practical problem, we first give a series of normalization models for defining the key attributes of teachers' professional foundation, course difficulty coefficient, and comprehensive evaluation of teaching. Then, we define a partial weight function to calculate the key attributes and obtain the partial recommendation values. Next, we construct a highly sparse Teaching Recommendation Factorization Machines (TRFMs) model, which takes the 5-tuple relation consisting of teacher, course, teachers' professional foundation, course difficulty, and teaching evaluation as the feature vector and takes the partial recommendation value as the recommendation label. Finally, we design a novel Top-N excellent teacher recommendation algorithm based on TRFMs with course classification on the highly sparse dataset. Experimental results show that the proposed TRFMs and recommendation algorithm can accurately recommend excellent teachers on a highly sparse historical teaching dataset. The recommendation accuracy is superior to that of the three-dimensional tensor decomposition model algorithm, which also handles sparse datasets. The proposed method can be used as a new recommendation method for teaching arrangements in all kinds of schools and can effectively improve teaching quality.
This study explored the concurrent scheduling of machines, tools, and the tool transporter (TT) with alternative machines in a multi-machine flexible manufacturing system (FMS), taking into account the tool transfer durations, with the aim of minimizing the makespan (MSN). When tools are expensive, just a single copy of every tool kind is made available in the FMS. Because the tools are housed in a central tool magazine (CTM), which distributes and delivers them to the machines, there is no longer a need to duplicate the tools at each machine, and the associated costs are avoided. Choosing alternative machines for job operations (jb-ons), assigning tools to jb-ons, sequencing jb-ons on machines, and arranging the allied trip activities, together with the TT's loaded trip times and deadheading periods, are all challenges that must be overcome to minimize the MSN. In addition to a mixed nonlinear integer programming (MNLIP) formulation for this simultaneous scheduling problem, this paper suggests a symbiotic organisms search algorithm (SOSA) for the problem's solution. This algorithm relies on organisms' symbiotic interaction strategies for survival in an ecosystem. The findings demonstrate that SOSA is superior to the Jaya algorithm in providing solutions and that using alternative machines for operations helps bring down the MSN.
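For orientation, the sketch below implements the mutualism phase of a generic symbiotic organisms search on a continuous test function; the makespan evaluation and the discrete scheduling encoding used in the paper are not reproduced here, so this is only a structural illustration of the metaheuristic.

```python
import numpy as np

rng = np.random.default_rng(1)

def objective(x):                       # placeholder for the makespan (MSN) evaluation
    return np.sum(x ** 2)

pop = rng.uniform(-5, 5, size=(20, 10))          # 20 organisms, 10 decision variables
fitness = np.array([objective(o) for o in pop])

for i in range(len(pop)):                         # one mutualism pass over the ecosystem
    j = rng.integers(len(pop))
    while j == i:
        j = rng.integers(len(pop))
    best = pop[np.argmin(fitness)]
    mutual = (pop[i] + pop[j]) / 2                # mutual benefit vector
    bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)   # benefit factors in {1, 2}
    new_i = pop[i] + rng.random(pop.shape[1]) * (best - mutual * bf1)
    new_j = pop[j] + rng.random(pop.shape[1]) * (best - mutual * bf2)
    for k, cand in ((i, new_i), (j, new_j)):      # greedy replacement if improved
        f = objective(cand)
        if f < fitness[k]:
            pop[k], fitness[k] = cand, f
```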
BACKGROUND: Colorectal polyps are precancerous lesions of colorectal cancer. Early detection and resection of colorectal polyps can effectively reduce the mortality of colorectal cancer. Endoscopic mucosal resection (EMR) is a common polypectomy procedure in clinical practice, but it has a high postoperative recurrence rate. Currently, there is no predictive model for the recurrence of colorectal polyps after EMR. AIM: To construct and validate a machine learning (ML) model for predicting the risk of colorectal polyp recurrence one year after EMR. METHODS: This study retrospectively collected data from 1694 patients at three medical centers in Xuzhou. Additionally, a total of 166 patients were collected to form a prospective validation set. Feature variable screening was conducted using univariate and multivariate logistic regression analyses, and five ML algorithms were used to construct the predictive models. The optimal models were evaluated based on different performance metrics. Decision curve analysis (DCA) and SHapley Additive exPlanation (SHAP) analysis were performed to assess clinical applicability and predictor importance. RESULTS: Multivariate logistic regression analysis identified 8 independent risk factors for colorectal polyp recurrence one year after EMR (P < 0.05). Among the models, eXtreme Gradient Boosting (XGBoost) demonstrated the highest area under the curve (AUC) in the training set, internal validation set, and prospective validation set, with AUCs of 0.909 (95%CI: 0.89-0.92), 0.921 (95%CI: 0.90-0.94), and 0.963 (95%CI: 0.94-0.99), respectively. DCA indicated favorable clinical utility for the XGBoost model. SHAP analysis identified smoking history, family history, and age as the top three most important predictors in the model. CONCLUSION: The XGBoost model has the best predictive performance and can assist clinicians in providing individualized colonoscopy follow-up recommendations.
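The decision curve analysis cited in the results reduces to a net-benefit calculation; the sketch below computes it for a binary recurrence model with simulated predictions (the formula is standard, but the data are not the study's).

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """NB(t) = TP/N - FP/N * t / (1 - t), treating p >= t as 'intervene'."""
    treat = y_prob >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)                                  # simulated recurrence labels
y_prob = np.clip(y_true * 0.4 + rng.random(500) * 0.6, 0, 1)      # simulated model output

for t in (0.1, 0.2, 0.3, 0.4):
    print(f"threshold {t:.1f}: net benefit {net_benefit(y_true, y_prob, t):.3f}")
```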
Objective: Accurately identifying the key influencing factors of psychological birth trauma in primiparous women is crucial for implementing effective preventive and intervention measures. This study aimed to develop and validate an interpretable machine learning prediction model for identifying the key influencing factors of psychological birth trauma in primiparous women. Methods: A multicenter cross-sectional study was conducted on primiparous women in four tertiary hospitals in Sichuan Province, southwestern China, from December 2023 to March 2024. The Childbirth Trauma Index was used to assess psychological birth trauma in primiparous women. Data were collected and randomly divided into a training set (80%, n=289) and a testing set (20%, n=73). Six machine learning models were trained and tested: Linear Regression, Support Vector Regression, Multilayer Perceptron Regression, eXtreme Gradient Boosting Regression, Random Forest Regression, and Adaptive Boosting Regression. The optimal model was selected based on various performance metrics, and its predictive results were interpreted using SHapley Additive exPlanations (SHAP) and accumulated local effects (ALE). Results: Among the six machine learning models, the Multilayer Perceptron Regression model exhibited the best overall performance in the testing set (MAE=3.977, MSE=24.832, R2=0.507, EVS=0.524, RMSE=4.983). In the testing set, the R2 and EVS of the Multilayer Perceptron Regression model increased by 8.3% and 1.2%, respectively, compared to the traditional linear regression model, while the MAE, MSE, and RMSE decreased by 0.4%, 7.3%, and 3.7%, respectively. The SHAP analysis indicated that intrapartum pain, anxiety, postpartum pain, resilience, and planned pregnancy are the most critical influencing factors of psychological birth trauma in primiparous women. The ALE analysis indicated that higher intrapartum pain, anxiety, and postpartum pain scores are risk factors, while higher resilience scores are protective factors. Conclusions: Interpretable machine learning prediction models can identify the key influencing factors of psychological birth trauma in primiparous women. SHAP and ALE analyses based on the Multilayer Perceptron Regression model can help healthcare providers understand the complex decision-making logic within a prediction model. This study provides a scientific basis for the early prevention and personalized intervention of psychological birth trauma in primiparous women.
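A brief sketch of fitting a multilayer perceptron regressor and reporting the five metrics quoted in the results (MAE, MSE, RMSE, R2, EVS); the synthetic data stand in for the questionnaire features and trauma scores.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             r2_score, explained_variance_score)

rng = np.random.default_rng(0)
X = rng.normal(size=(362, 12))                         # 362 women, 12 candidate predictors
y = X[:, 0] * 3 + X[:, 1] * 2 + rng.normal(size=362)   # synthetic Childbirth Trauma Index

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=42).fit(X_tr, y_tr)
pred = model.predict(X_te)

mse = mean_squared_error(y_te, pred)
print("MAE :", mean_absolute_error(y_te, pred))
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("R2  :", r2_score(y_te, pred))
print("EVS :", explained_variance_score(y_te, pred))
```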
The endpoint carbon content in the converter is critical for the quality of steel products, and accurately predicting this parameter is an effective way to reduce alloy consumption and improve smelting efficiency. However, most scholars currently focus on modifying methods to enhance model accuracy while overlooking the extent to which input parameters influence accuracy. To address this issue, in this study, a prediction model for the endpoint carbon content in the converter was developed using factor analysis (FA) and a support vector machine (SVM) optimized by improved particle swarm optimization (IPSO). Analysis of the factors influencing the endpoint carbon content during the converter smelting process led to the identification of 21 input parameters. Subsequently, FA was used to reduce the dimensionality of the data and applied to the prediction model. The results demonstrate that the performance of the FA-IPSO-SVM model surpasses several existing methods, such as twin support vector regression and the support vector machine. The model achieves hit rates of 89.59%, 96.21%, and 98.74% within error ranges of ±0.01%, ±0.015%, and ±0.02%, respectively. Finally, based on the prediction results obtained by sequentially removing input parameters, the parameters were classified into high-influence (5%-7%), medium-influence (2%-5%), and low-influence (0-2%) categories according to their varying degrees of impact on prediction accuracy. This classification provides a reference for selecting input parameters in future prediction models for endpoint carbon content.
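The sketch below strings together factor analysis, an SVR carbon-content model, and the hit-rate-within-error-band metric used in the abstract; the IPSO hyperparameter search is replaced by fixed SVR parameters for brevity, and the smelting data are simulated.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1500, 21))                                              # 21 raw smelting parameters
y = 0.04 + 0.01 * np.tanh(X[:, :5].sum(axis=1)) + rng.normal(0, 0.002, 1500) # synthetic endpoint C, wt%

factors = FactorAnalysis(n_components=8, random_state=0).fit_transform(X)    # FA dimensionality reduction
X_tr, X_te, y_tr, y_te = train_test_split(factors, y, test_size=0.2, random_state=0)

pred = SVR(C=10.0, epsilon=0.001, gamma="scale").fit(X_tr, y_tr).predict(X_te)

for band in (0.01, 0.015, 0.02):                                             # absolute error bands in wt%
    hit_rate = np.mean(np.abs(pred - y_te) <= band) * 100
    print(f"hit rate within +/-{band}%: {hit_rate:.2f}%")
```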
BACKGROUND: Ischemic heart disease (IHD) impacts quality of life and has the highest mortality rate of cardiovascular diseases globally. AIM: To compare variations in the parameters of the single-lead electrocardiogram (ECG) during resting conditions and physical exertion in individuals diagnosed with IHD and those without the condition, using vasodilator-induced stress computed tomography (CT) myocardial perfusion imaging as the diagnostic reference standard. METHODS: This single-center observational study included 80 participants. The participants were aged ≥40 years and gave written informed consent to participate in the study. Both groups, G1 (n=31) with and G2 (n=49) without a post-stress-induced myocardial perfusion defect, underwent cardiologist consultation, anthropometric measurements, blood pressure and pulse rate measurement, echocardiography, cardio-ankle vascular index, and bicycle ergometry, with a 3-minute single-lead ECG (Cardio-Qvark) recorded before and just after bicycle ergometry, followed by CT myocardial perfusion imaging. LASSO regression with nested cross-validation was used to find the association between Cardio-Qvark parameters and the existence of the perfusion defect. Statistical processing was performed with the R programming language v4.2, Python v3.10, and the Statistica 12 program. RESULTS: Bicycle ergometry yielded an area under the receiver operating characteristic curve of 50.7% [95% confidence interval (CI): 0.388-0.625], specificity of 53.1% (95%CI: 0.392-0.673), and sensitivity of 48.4% (95%CI: 0.306-0.657). In contrast, the Cardio-Qvark test performed notably better, with an area under the receiver operating characteristic curve of 67% (95%CI: 0.530-0.801), specificity of 75.5% (95%CI: 0.628-0.88), and sensitivity of 51.6% (95%CI: 0.333-0.695). CONCLUSION: The single-lead ECG has relatively higher diagnostic accuracy than bicycle ergometry when machine learning models are used, but the difference was not statistically significant. However, further investigations are required to uncover the hidden capabilities of the single-lead ECG in IHD diagnosis.
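A compact sketch of LASSO-penalised logistic regression with nested cross-validation, the selection scheme described for the Cardio-Qvark parameters; the data are simulated and the feature count is a placeholder, not the study's ECG parameter set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 25))                       # 80 participants, 25 candidate ECG parameters
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 80) > 0).astype(int)   # perfusion defect label

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

lasso = GridSearchCV(
    LogisticRegression(penalty="l1", solver="liblinear", max_iter=5000),
    param_grid={"C": np.logspace(-2, 2, 9)},
    scoring="roc_auc", cv=inner)

# The outer loop gives an unbiased AUC estimate; the inner loop tunes the L1 strength.
auc = cross_val_score(lasso, X, y, scoring="roc_auc", cv=outer)
print("nested-CV AUC: %.3f +/- %.3f" % (auc.mean(), auc.std()))
```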
BACKGROUND: Colorectal cancer is a common digestive malignancy, and chemotherapy remains a cornerstone of treatment. Myelosuppression, a frequent hematologic toxicity, poses significant clinical challenges. However, no interpretable machine learning-based nomogram exists to predict chemotherapy-induced myelosuppression in colorectal cancer patients. This study aimed to develop and validate an interpretable clinic-machine learning nomogram integrating clinical predictors with multiple algorithms via a feature mapping algorithm. The model provides accurate risk estimation and clinical interpretability, supporting individualized prevention strategies and optimizing decision-making in patients receiving first-line chemotherapy. AIM: To develop and validate an interpretable clinic-machine learning nomogram predicting chemotherapy-induced myelosuppression in colorectal cancer. METHODS: This retrospective study enrolled 855 colorectal cancer patients receiving first-line chemotherapy. Data were split into training (n=612), validation (n=153), and testing (n=90) cohorts. Ten predictors were identified through least absolute shrinkage and selection operator, decision tree, random forest, and expert consensus. Ten machine learning algorithms were applied, with performance assessed by area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPRC), calibration, and decision curves. The optimal model was integrated into a clinic-machine learning nomogram via the feature mapping algorithm, which was internally validated for predictive accuracy and clinical utility. RESULTS: A total of 855 colorectal cancer patients were enrolled, with 765 cases (April 2020 to December 2023) used for model training and validation, and 90 cases (January 2024 to July 2024) for internal testing. Baseline clinical features did not differ significantly between the training and validation cohorts (P > 0.05). Ten predictors were identified through integrated feature selection and expert consensus, including age, body surface area, body mass index, tumor position, albumin, carcinoembryonic antigen, carbohydrate antigen (CA) 19-9, CA125, chemotherapy regimen, and chemotherapy cycles. Among the ten machine learning algorithms, extreme gradient boosting achieved the best validation performance (AUC=0.97, AUPRC=0.92, sensitivity=0.79, specificity=0.92, accuracy=0.88). Logistic regression confirmed extra trees and random forest as independent predictors, which were incorporated into a clinic-machine learning nomogram. The clinic-machine learning nomogram demonstrated superior discrimination (AUC=0.96, AUPRC=0.93, accuracy=0.90, specificity=0.95), good calibration, and greater net clinical benefit across a wide probability range (10%-90%). Internal testing further confirmed its robustness and generalizability (AUC=0.95). CONCLUSION: The clinic-machine learning nomogram accurately predicts chemotherapy-induced myelosuppression in colorectal cancer, providing interpretability and clinical utility to support individualized risk assessment and treatment decision-making.
Permeability is one of the main oil reservoir characteristics. It affects potential oil production, well-completion technologies, the choice of enhanced oil recovery methods, and more. The methods used to determine and predict reservoir permeability have serious shortcomings. This article aims to refine and adapt machine learning techniques using historical data from hydrocarbon field development to evaluate and predict parameters such as the skin factor and the permeability of the remote reservoir zone. The article analyzes data from 4045 well tests in oil fields of the Perm Krai (Russia). The performance of different Machine Learning (ML) algorithms in predicting well permeability is evaluated. Three different real datasets are used to train more than 20 machine learning regressors, whose hyperparameters are optimized using Bayesian Optimization (BO). The resulting models demonstrate significantly better predictive performance compared to traditional methods, and the best ML model found is one that had never before been applied to this problem. The permeability prediction model is characterized by a high adjusted R^(2) value of 0.799. A promising approach is the integration of machine learning methods and the use of pressure recovery curves to estimate permeability in real time. The work is unique for its approach to predicting pressure recovery curves during well operation without stopping the wells, providing primary data for interpretation. These innovations are exclusive and can improve the accuracy of permeability forecasts. The approach also reduces well downtime associated with traditional well-testing procedures. The proposed methods pave the way for more efficient and cost-effective reservoir development, ultimately supporting better decision-making and resource optimization in oil production.
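As a hedged illustration of hyperparameter optimisation for a permeability regressor, the sketch below uses Optuna's TPE sampler (a Bayesian-optimisation variant) on a gradient-boosting model scored by cross-validated R^2; the features and targets are simulated stand-ins for the well-test data.

```python
import numpy as np
import optuna
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(4045, 12))                                  # 4045 well tests, 12 features
y = X[:, 0] * 2 + np.sin(X[:, 1]) + rng.normal(0, 0.5, 4045)     # synthetic log-permeability

def objective(trial):
    model = GradientBoostingRegressor(
        n_estimators=trial.suggest_int("n_estimators", 100, 400),
        max_depth=trial.suggest_int("max_depth", 2, 6),
        learning_rate=trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        random_state=0)
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("best CV R^2:", study.best_value, "with", study.best_params)
```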
BACKGROUND: Pancreatic fistula is the most common complication of pancreatic surgeries and can cause more serious conditions, including bleeding due to visceral vessel erosion and peritonitis. AIM: To develop a machine learning (ML) model for postoperative pancreatic fistula and identify significant risk factors of the complication. METHODS: A single-center retrospective clinical study was conducted which included 150 patients who underwent pancreatoduodenectomy. Logistic regression, random forest, and CatBoost were employed for modeling the biochemical leak (symptomless fistula) and fistula grade B/C (clinically significant complication). The performance was estimated by the receiver operating characteristic (ROC) area under the curve (AUC) after 5-fold cross-validation (20% testing and 80% training data). The risk factors were evaluated with the most accurate algorithm, based on the parameter "Importance" (Im) and Kendall correlation, P < 0.05. RESULTS: The CatBoost algorithm was the most accurate, with an AUC of 74%-86%. The study provided results of ML-based modeling and algorithm selection for pancreatic fistula prediction and risk factor evaluation. From 14 parameters we selected the main pre- and intraoperative prognostic factors of all the fistulas: tumor vascular invasion (Im=24.8%), age (Im=18.6%), and body mass index (Im=16.4%), AUC=74%. The ML model showed that biochemical leak, blood and drain amylase level (Im=21.6% and 16.4%), and blood leukocytes (Im=11.2%) were crucial predictors for subsequent fistula B/C, AUC=86%. Surgical techniques, morphology, and pancreatic duct diameter less than 3 mm were insignificant (Im < 5% and no correlations detected). The results were confirmed by correlation analysis. CONCLUSION: This study highlights the key predictors of postoperative pancreatic fistula and establishes a robust ML-based model for individualized risk prediction. These findings contribute to the advancement of personalized perioperative care and may guide targeted preventive strategies.
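A minimal sketch of the evaluation pipeline the abstract describes: 5-fold cross-validated ROC AUC for a CatBoost classifier, CatBoost feature importances (the "Im" percentages), and a Kendall correlation check; the patient data are simulated.

```python
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import cross_val_score
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 14))                                          # 150 patients, 14 parameters
y = (X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 1, 150) > 0).astype(int)   # fistula outcome label

model = CatBoostClassifier(iterations=300, depth=4, verbose=0, random_seed=0)
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("5-fold AUC:", auc.round(3))

model.fit(X, y)
importance = model.get_feature_importance()          # percentages analogous to "Im"
print("top factor share: %.1f%%" % importance.max())

tau, p = kendalltau(X[:, 0], y)                       # correlation confirmation, P < 0.05
print("Kendall tau = %.2f, p = %.3f" % (tau, p))
```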
Research on the application of machine learning (ML) models to landslide susceptibility assessment has gained popularity in recent years, with a focus primarily on topographic factors derived from digital elevation models (DEMs). However, few studies have focused on the explanatory effects of these factors on different models, i.e., whether DEM-based factors affect different models in the same way. This study investigated whether different ML models could yield consistent interpretations of DEM-based factors using explanatory algorithms. Six ML models, including a support vector machine, a neural network, extreme gradient boosting, a random forest, linear regression, and K-nearest neighbors, were trained and evaluated on five geospatial datasets derived from different DEMs. Each dataset contained eight DEM-based and six non-DEM-based factors from 8912 landslide samples. Model performance was assessed using accuracy, precision, recall rate, F1-score, kappa coefficient, and receiver operating characteristic curves. Explanatory analyses, including Shapley additive explanations and partial dependence plots, were also employed to investigate the effects of topographic factors on landslide susceptibility. The results indicate that DEM-based factors consistently influenced the different ML models across the datasets. Furthermore, tree-based models outperformed the other models in almost all datasets, while the most suitable DEMs were obtained from Copernicus and TanDEM-X. In addition, concave surfaces without potholes on steep slopes are ideal topographic conditions for landslide formation in the study area. This study can benefit the wider landslide research community by clarifying how topographic factors affect ML models.
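The sketch below computes a partial dependence curve for one DEM-derived factor under a random-forest susceptibility model, the kind of explanatory check the study ran across its six models and five DEM datasets; the data are simulated.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 14))                          # 8 DEM-based + 6 non-DEM-based factors
y = ((X[:, 0] > 0.3) & (X[:, 1] < 0.5)).astype(int)      # synthetic landslide label

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Average predicted landslide probability as factor 0 is varied over its grid.
pd_result = partial_dependence(rf, X, features=[0], kind="average")
print(pd_result["average"])
```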
Moisture accumulation within road pavements, particularly in unbound granular materials with or without thin sprayed seals, presents significant challenges in high-rainfall regions such as Queensland. This infiltration often leads to various forms of pavement distress, eventually causing irreversible damage to the pavement structure. The moisture content within pavements is highly dynamic and directly influenced by environmental factors such as precipitation, air temperature, and relative humidity. This variability underscores the importance of monitoring moisture changes using real-time climatic data to assess pavement conditions for operational management, or of incorporating these effects during pavement design based on historical climate data. Consequently, there is an increasing demand for advanced, technology-driven methodologies to predict moisture variations based on climatic inputs. Addressing this gap, the present study employs five traditional machine learning (ML) algorithms of varying complexity, K-nearest neighbors (KNN), regression trees, random forest, support vector machines (SVMs), and Gaussian process regression (GPR), to forecast moisture levels within pavement layers over time. Using data collected from an instrumented road in Brisbane, Australia, which include pavement moisture and climatic factors, the study develops predictive models to forecast moisture content at future time steps. The approach incorporates the current moisture content, rather than averaged values, along with seasonality (both daily and annual) and key climatic factors to predict the next-step moisture. Model performance is evaluated using R2, MSE, RMSE, and MAPE metrics. Results show that ML algorithms can reliably predict long-term moisture variations in pavements, provided optimal hyperparameters are selected for each algorithm. The best-performing configurations include KNN (with 15 neighbours), a medium regression tree, a medium random forest, a coarse SVM, and simple GPR, with the medium random forest outperforming the others. The study also identifies the optimal hyperparameter combinations for each algorithm, offering significant advancements in moisture prediction tools for pavement technology.
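A sketch of the one-step-ahead formulation the abstract describes, current moisture plus climate inputs and sine/cosine seasonality terms feeding a 15-neighbour KNN regressor; the hourly series below is synthetic, not the Brisbane instrumentation data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_percentage_error

rng = np.random.default_rng(0)
hours = np.arange(24 * 365)
rain = rng.exponential(0.2, hours.size)
temp = 20 + 8 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size)
moisture = 10 + np.convolve(rain, np.ones(48) / 48, mode="same") \
           + 0.05 * np.sin(2 * np.pi * hours / (24 * 365))

X = np.column_stack([
    moisture[:-1], rain[:-1], temp[:-1],
    np.sin(2 * np.pi * hours[:-1] / 24), np.cos(2 * np.pi * hours[:-1] / 24),              # daily cycle
    np.sin(2 * np.pi * hours[:-1] / (24 * 365)), np.cos(2 * np.pi * hours[:-1] / (24 * 365)),  # annual cycle
])
y = moisture[1:]                                              # next-step moisture content

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)
pred = KNeighborsRegressor(n_neighbors=15).fit(X_tr, y_tr).predict(X_te)
print("R2  :", r2_score(y_te, pred))
print("MAPE:", mean_absolute_percentage_error(y_te, pred))
```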
Objective: As an age-related neurodegenerative condition, mild cognitive impairment (MCI) increases in prevalence with age. Within the framework of traditional Chinese medicine, spleen-kidney deficiency syndrome (SKDS) is recognized as the most frequent MCI subtype. Because of the covert and gradual onset of MCI, in community settings it is a significant challenge for patients and their families to discern between typical aging and pathological change. There is therefore an urgent need for a preliminary diagnostic tool designed for community-residing older adults with MCI attributed to SKDS (MCI-SKDS). Methods: This investigation enrolled 312 elderly individuals diagnosed with MCI, who were randomly distributed into training and test datasets at a 3:1 ratio. Five machine learning methods, logistic regression (LR), decision tree (DT), naive Bayes (NB), support vector machine (SVM), and gradient boosting (GB), were used to build a diagnostic prediction model for MCI-SKDS. Accuracy, sensitivity, specificity, precision, F1 score, and area under the curve were used to evaluate model performance. Furthermore, the clinical applicability of the models was evaluated through decision curve analysis (DCA). Results: The accuracy, precision, specificity, and F1 score of the DT model were the best in the training set (test set), with scores of 0.904 (0.845), 0.875 (0.795), 0.973 (0.875), and 0.973 (0.875). The sensitivity of the SVM model in the training set (test set) was the best among the five models, at 0.865 (0.821). The area under the curve of all five models was greater than 0.9 for the training dataset and greater than 0.8 for the test dataset. The DCA of all models showed good clinical application value. The study identified ten indicators that were significant predictors of MCI-SKDS. Conclusion: The risk prediction indicators derived from machine learning for the MCI-SKDS prediction model are simple and practical; the model demonstrates good predictive value and clinical applicability, and the DT model performed best.
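To make the model comparison concrete, the sketch below trains the five classifier families named in the abstract and reports accuracy, sensitivity, and specificity on a 3:1 split; the features and labels are simulated, not the MCI-SKDS questionnaire data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(312, 10))                                           # 312 participants, 10 indicators
y = (X[:, 0] + 0.7 * X[:, 1] + rng.normal(0, 1, 312) > 0).astype(int)    # MCI-SKDS label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

models = {"LR": LogisticRegression(max_iter=1000), "DT": DecisionTreeClassifier(max_depth=4),
          "NB": GaussianNB(), "SVM": SVC(probability=True), "GB": GradientBoostingClassifier()}

for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.3f} "
          f"sensitivity={tp / (tp + fn):.3f} specificity={tn / (tn + fp):.3f}")
```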
Funding: Funded by the Ministry of Science and Higher Education of the Russian Federation(Project No.FSNM-2024-0005).
Abstract: Permeability is one of the main oil reservoir characteristics.It affects potential oil production,well-completion technologies,the choice of enhanced oil recovery methods,and more.The methods traditionally used to determine and predict reservoir permeability have serious shortcomings.This article aims to refine and adapt machine learning techniques using historical data from hydrocarbon field development to evaluate and predict parameters such as the skin factor and the permeability of the remote reservoir zone.The article analyzes data from 4045 well tests in oil fields in the Perm Krai(Russia).The performance of different Machine Learning(ML)algorithms in predicting well permeability is evaluated.Three different real datasets are used to train more than 20 machine learning regressors,whose hyperparameters are optimized using Bayesian Optimization(BO).The resulting models demonstrate significantly better predictive performance than traditional methods,and the best ML model identified had not previously been applied to this problem.The permeability prediction model achieves a high adjusted R^(2) value of 0.799.A promising approach is the integration of machine learning methods with pressure recovery curves to estimate permeability in real time.The work is distinctive in predicting pressure recovery curves during well operation without stopping the wells,providing primary data for interpretation.These innovations can improve the accuracy of permeability forecasts and reduce the well downtime associated with traditional well-testing procedures.The proposed methods pave the way for more efficient and cost-effective reservoir development,ultimately supporting better decision-making and resource optimization in oil production.
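A Bayesian hyperparameter search of the kind described above might look roughly like the sketch below, here using Optuna and a gradient-boosting regressor as a stand-in for one of the candidate models. The file name, feature layout, search space, and the adjusted R^(2) computation at the end are assumptions for illustration, not the authors' setup.

```python
# Sketch: Bayesian optimization of regressor hyperparameters, scored by cross-validated R^2,
# then adjusted R^2 on a held-out split (all names hypothetical).
import optuna
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import r2_score

df = pd.read_csv("well_test_dataset.csv")                  # hypothetical well-test table
X, y = df.drop(columns=["permeability"]), df["permeability"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
    }
    model = GradientBoostingRegressor(**params, random_state=0)
    return cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()

study = optuna.create_study(direction="maximize")           # TPE-based Bayesian search
study.optimize(objective, n_trials=50)

best = GradientBoostingRegressor(**study.best_params, random_state=0).fit(X_train, y_train)
r2 = r2_score(y_test, best.predict(X_test))
n, p = X_test.shape
print("adjusted R^2:", 1 - (1 - r2) * (n - 1) / (n - p - 1))
```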
Abstract: BACKGROUND Pancreatic fistula is the most common complication of pancreatic surgery and can lead to more serious conditions,including bleeding due to visceral vessel erosion and peritonitis.AIM To develop a machine learning(ML)model for postoperative pancreatic fistula and identify significant risk factors of the complication.METHODS A single-center retrospective clinical study was conducted that included 150 patients who underwent pancreatoduodenectomy.Logistic regression,random forest,and CatBoost were employed to model biochemical leak(symptomless fistula)and grade B/C fistula(clinically significant complication).Performance was estimated by the receiver operating characteristic(ROC)area under the curve(AUC)after 5-fold cross-validation(20%testing and 80%training data).Risk factors were evaluated with the most accurate algorithm,based on the parameter“Importance”(Im)and Kendall correlation,P<0.05.RESULTS The CatBoost algorithm was the most accurate,with an AUC of 74%-86%.The study provided results of ML-based modeling and algorithm selection for pancreatic fistula prediction and risk factor evaluation.From 14 parameters we selected the main pre-and intraoperative prognostic factors for all fistulas:Tumor vascular invasion(Im=24.8%),age(Im=18.6%),and body mass index(Im=16.4%),AUC=74%.The ML model showed that biochemical leak,blood and drain amylase levels(Im=21.6%and 16.4%),and blood leukocytes(Im=11.2%)were crucial predictors of subsequent grade B/C fistula,AUC=86%.Surgical technique,morphology,and pancreatic duct diameter less than 3 mm were insignificant(Im<5%and no correlations detected).The results were confirmed by correlation analysis.CONCLUSION This study highlights the key predictors of postoperative pancreatic fistula and establishes a robust ML-based model for individualized risk prediction.These findings contribute to the advancement of personalized perioperative care and may guide targeted preventive strategies.
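A minimal sketch of the CatBoost modelling and risk-factor ranking described above is given below, assuming the 14 candidate parameters are already numerically encoded in a table; the file and column names are placeholders, not the study's actual variables.

```python
# Sketch: 5-fold cross-validated ROC AUC for CatBoost, then per-feature
# "Importance" (Im, %) from the fitted model (all names hypothetical).
import pandas as pd
from catboost import CatBoostClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

df = pd.read_csv("pancreatoduodenectomy_cohort.csv")        # hypothetical dataset
X = df.drop(columns=["fistula_grade_bc"])
y = df["fistula_grade_bc"]                                   # assumed binary target

model = CatBoostClassifier(iterations=500, depth=4, learning_rate=0.05,
                           verbose=0, random_seed=42)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("5-fold ROC AUC:", auc.mean())

# Rank candidate risk factors by CatBoost's built-in importance (sums to 100%)
model.fit(X, y)
for name, imp in sorted(zip(X.columns, model.get_feature_importance()),
                        key=lambda t: -t[1]):
    print(f"{name}: Im = {imp:.1f}%")
```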
Funding: Supported by the National Key Research and Development Program of China(Grant No.2022YFC3003205);the Chengdu University of Technology Postgraduate Innovative Cultivation Program(Grant No.10800-000510-01-022);the Sichuan Science and Technology Program(Grant No.2025ZNSFSC1206);and the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection Independent Research Project(Grant No.SKLGP2023Z026).
Abstract: Research on the application of machine learning(ML)models to landslide susceptibility assessments has gained popularity in recent years,with a focus primarily on topographic factors derived from digital elevation models(DEMs).However,few studies have focused on the explanatory effects of these factors on different models,i.e.,whether DEM-based factors affect different models in the same way.This study investigated whether different ML models could yield consistent interpretations of DEM-based factors using explanatory algorithms.Six ML models,including a support vector machine,a neural network,extreme gradient boosting,a random forest,linear regression,and K-nearest neighbors,were trained and evaluated on five geospatial datasets derived from different DEMs.Each dataset contained eight DEM-based and six non-DEM-based factors from 8912 landslide samples.Model performance was assessed using accuracy,precision,recall,F1-score,kappa coefficient,and receiver operating characteristic curves.Explanatory analyses,including Shapley additive explanations and partial dependence plots,were also employed to investigate the effects of topographic factors on landslide susceptibility.The results indicate that DEM-based factors consistently influenced different ML models across the datasets.Furthermore,tree-based models outperformed the other models in almost all datasets,while the most suitable DEMs were obtained from Copernicus and TanDEM-X.In addition,concave surfaces without potholes on steep slopes are ideal topographic conditions for landslide formation in the study area.This study can benefit the wider landslide research community by clarifying how topographic factors affect ML models.
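The explanatory step described above can be sketched roughly as follows, using SHAP on a tree-based model as one of the six candidates. The data file, column names, and model settings are assumptions, and the plots shown are generic SHAP outputs rather than the figures from the study.

```python
# Sketch: fit a tree-based susceptibility model on DEM-based and non-DEM-based
# factors, then use SHAP values to quantify and visualize factor effects.
import pandas as pd
import shap
from xgboost import XGBClassifier

df = pd.read_csv("landslide_samples.csv")                   # hypothetical sample table
factors = [c for c in df.columns if c != "landslide"]        # 8 DEM-based + 6 other factors assumed
X, y = df[factors], df["landslide"]

model = XGBClassifier(n_estimators=400, max_depth=5, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global factor importance (mean |SHAP|) and a per-factor dependence plot;
# the latter exposes nonlinear effects of a single topographic factor.
shap.summary_plot(shap_values, X)
shap.dependence_plot("slope", shap_values, X)                # "slope" is an assumed column name
```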
Funding: Financial and intellectual support was provided by Queensland University of Technology(QUT),Australia,through its Higher Degree Research Program,which played a crucial role in the successful completion of this research study.
Abstract: Moisture accumulation within road pavements,particularly in unbound granular materials with or without thin sprayed seals,presents significant challenges in high-rainfall regions such as Queensland.This infiltration often leads to various forms of pavement distress,eventually causing irreversible damage to the pavement structure.The moisture content within pavements is highly dynamic and directly influenced by environmental factors such as precipitation,air temperature,and relative humidity.This variability underscores the importance of monitoring moisture changes with real-time climatic data to assess pavement conditions for operational management,or of incorporating these effects during pavement design based on historical climate data.Consequently,there is an increasing demand for advanced,technology-driven methodologies to predict moisture variations from climatic inputs.Addressing this gap,the present study employs five traditional machine learning(ML)algorithms of varying complexity,namely K-nearest neighbors(KNN),regression trees,random forest,support vector machines(SVMs),and Gaussian process regression(GPR),to forecast moisture levels within pavement layers over time.Using data collected from an instrumented road in Brisbane,Australia,which includes pavement moisture and climatic factors,the study develops predictive models to forecast moisture content at future time steps.The approach uses the current moisture content,rather than averaged values,along with seasonality(both daily and annual)and key climatic factors to predict next-step moisture.Model performance is evaluated using R2,MSE,RMSE,and MAPE metrics.Results show that ML algorithms can reliably predict long-term moisture variations in pavements,provided optimal hyperparameters are selected for each algorithm.The best-performing configurations are KNN(with 15 neighbours),a medium regression tree,a medium random forest,a coarse SVM,and simple GPR,with the medium random forest outperforming the others.The study also identifies the optimal hyperparameter combinations for each algorithm,offering significant advancements in moisture prediction tools for pavement technology.
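The next-step forecasting setup described above might be sketched as follows with the KNN configuration (15 neighbours). The monitoring file, feature names, and seasonality encoding are assumptions for demonstration, not the instrumented-road dataset itself.

```python
# Sketch: predict next-step pavement moisture from current moisture, seasonality
# terms, and climatic inputs using a 15-neighbour KNN regressor (names hypothetical).
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score, mean_squared_error

df = pd.read_csv("brisbane_pavement_monitoring.csv")         # hypothetical monitoring data
df["moisture_next"] = df["moisture"].shift(-1)                # next-step target
df = df.dropna()

features = ["moisture", "rainfall", "air_temperature",
            "relative_humidity", "day_of_year_sin", "day_of_year_cos"]  # assumed columns
X, y = df[features], df["moisture_next"]

split = int(0.8 * len(df))                                    # chronological split, no shuffling
model = KNeighborsRegressor(n_neighbors=15).fit(X[:split], y[:split])

pred = model.predict(X[split:])
print("R2  :", r2_score(y[split:], pred))
print("RMSE:", np.sqrt(mean_squared_error(y[split:], pred)))
```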
Funding: Funded by the National Natural Science Foundation of China(No.82405530,81973921 and 72374068)and the Science and Technology Research Project of Hubei Provincial Department of Education(No.B2023098).
Abstract: Objective:Mild cognitive impairment(MCI)is an age-related neurodegenerative disease whose prevalence increases with age.Within the framework of traditional Chinese medicine,spleen-kidney deficiency syndrome(SKDS)is recognized as the most frequent MCI subtype.Because MCI has a covert and gradual onset,it is a significant challenge for patients and their families in community settings to distinguish typical aging from pathological changes.There is therefore an urgent need for a preliminary diagnostic tool designed for community-residing older adults with MCI attributed to SKDS(MCI-SKDS).Methods:This investigation enrolled 312 elderly individuals diagnosed with MCI,who were randomly divided into training and test datasets at a 3:1 ratio.Five machine learning methods,including logistic regression(LR),decision tree(DT),naive Bayes(NB),support vector machine(SVM),and gradient boosting(GB),were used to build a diagnostic prediction model for MCI-SKDS.Accuracy,sensitivity,specificity,precision,F1 score,and area under the curve were used to evaluate model performance.Furthermore,the clinical applicability of the models was evaluated through decision curve analysis(DCA).Results:The DT model performed best in accuracy,precision,specificity,and F1 score in the training set(test set),with scores of 0.904(0.845),0.875(0.795),0.973(0.875)and 0.973(0.875).The SVM model achieved the best sensitivity among the five models,with a training-set(test-set)score of 0.865(0.821).The area under the curve of all five models was greater than 0.9 for the training dataset and greater than 0.8 for the test dataset.The DCA of all models showed good clinical application value.The study identified ten indicators that were significant predictors of MCI-SKDS.Conclusion:The machine learning-derived risk prediction indicators for the MCI-SKDS model are simple and practical;the model demonstrates good predictive value and clinical applicability,and the DT model performed best.
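The decision-tree modelling and evaluation described above can be sketched roughly as follows, assuming the ten predictive indicators are already coded numerically; the file name, target label, and tree depth are placeholders, not the study's actual configuration.

```python
# Sketch: train a decision tree on a 3:1 split and report accuracy, sensitivity,
# specificity, precision, F1, and AUC on the test set (names hypothetical).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (accuracy_score, recall_score, precision_score,
                             f1_score, roc_auc_score)

df = pd.read_csv("mci_skds_survey.csv")                       # hypothetical community dataset
X = df.drop(columns=["mci_skds"])
y = df["mci_skds"]                                             # 1 = MCI attributed to SKDS

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1)

tree = DecisionTreeClassifier(max_depth=4, random_state=1).fit(X_train, y_train)
pred = tree.predict(X_test)

print("accuracy   :", accuracy_score(y_test, pred))
print("sensitivity:", recall_score(y_test, pred))              # recall of positive class
print("specificity:", recall_score(y_test, pred, pos_label=0)) # recall of negative class
print("precision  :", precision_score(y_test, pred))
print("F1         :", f1_score(y_test, pred))
print("AUC        :", roc_auc_score(y_test, tree.predict_proba(X_test)[:, 1]))
```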