Transforming urban spatial structures to promote green and low-carbon development is an effective strategy. Although prior studies have examined the impact of urban polycentricity on carbon emissions and economic development, research on its role in the synergistic relationship between these factors, as captured by carbon emission efficiency, is limited. Furthermore, existing literature often overlooks nonlinear effects and interactions with other urban variables. This paper analyzes data from 295 Chinese cities in 2020, calculating urban population polycentricity, population dispersion indices, and carbon emission efficiency. Using local spatial autocorrelation tools, we reveal interactions among urban population polycentricity, dispersion, carbon emissions, and carbon emission efficiency. We then employ a gradient boosting decision tree (GBDT) model to explore the nonlinear and synergistic effects of polycentric urbanization. Key findings include: 1) Polycentric urbanization in Chinese cities exhibits significant spatial differentiation. The polycentricity index is low overall but relatively high in economically developed eastern coastal regions; carbon emissions are concentrated in industrialized north-central cities and some Yangtze River Delta hubs; and carbon emission efficiency is highest in the Yangtze River Delta and relatively low in Northeast China. There are also significant spatially heterogeneous interactions among population polycentricity, population dispersion, carbon emissions, and carbon emission efficiency. 2) Urban population polycentricity contributes 9.42% to total carbon emissions and 6.24% to carbon emission efficiency. 3) The polycentricity index has a nonlinear impact on carbon emissions and carbon emission efficiency: no significant effect below 0.50 or above 0.55, increased carbon emissions between 0.50 and 0.53, and reduced carbon emissions with improved efficiency between 0.53 and 0.55. 4) The polycentricity index interacts with other variables; specifically, when the index is between 0.53 and 0.55, its interaction with urban gross domestic product (GDP), urban population, urban built-up area, green coverage rate in built-up areas, urban technological expenditure, and the proportion of secondary-industry output value reduces carbon emissions and improves carbon emission efficiency. These findings enhance the understanding of urban spatial structures and carbon emissions, providing valuable insights for policymakers developing green and low-carbon strategies.
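A threshold-style response of the kind reported in finding 3 (flat, then positive, then negative, then flat again) is something a boosted ensemble of shallow trees can represent. The sketch below is a minimal pure-Python gradient boosting regressor built from one-split stumps under squared loss. It is not the authors' model: the data points, the learning rate, and the number of rounds are hypothetical values chosen only to mimic the qualitative shape described in the abstract.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) of the best single split."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1], best[2], best[3]


def fit_gbdt(xs, ys, n_rounds=300, lr=0.3):
    """Boost stumps on the residuals (the negative gradient of squared loss)."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + lr * (lm if x <= thr else rm) for x, p in zip(xs, preds)]
    return base, stumps, lr


def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lm if x <= thr else rm) for thr, lm, rm in stumps)


# Hypothetical data: polycentricity index vs. a made-up emission effect.
xs = [0.44, 0.46, 0.48, 0.51, 0.52, 0.535, 0.545, 0.56, 0.58, 0.60]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, -1.0, -1.0, 0.0, 0.0, 0.0]
model = fit_gbdt(xs, ys)
for x in (0.46, 0.52, 0.54, 0.58):
    print(x, round(predict(model, x), 3))
```

Plotting `predict` over a grid of index values gives a one-dimensional partial dependence curve, which is a common way to read threshold effects out of a fitted GBDT.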
Accurate and interpretable fault diagnosis in industrial gear systems is essential for ensuring safety, reliability, and predictive maintenance. This study presents an intelligent diagnostic framework that uses Gradient Boosting (GB) for fault detection in gear systems, applied to the Aalto Gear Fault Dataset, which features a wide range of synthetic and realistic gear failure modes under varied operating conditions. The dataset was preprocessed and analyzed using an ensemble GB classifier, yielding high performance across multiple metrics: accuracy of 96.77%, precision of 95.44%, recall of 97.11%, and an F1-score of 96.22%. To enhance trust in model predictions, the study integrates an explainable AI (XAI) framework using SHAP (SHapley Additive exPlanations) to visualize feature contributions and support diagnostic transparency. A flowchart-based architecture is proposed to guide real-world deployment of interpretable fault detection pipelines. The results demonstrate the feasibility of combining predictive performance with interpretability, offering a robust approach for condition monitoring in safety-critical systems.
BACKGROUND Severe esophagogastric varices (EGVs) significantly affect the prognosis of patients with hepatitis B because of the risk of life-threatening hemorrhage. Endoscopy is the gold standard for EGV detection, but it is invasive, costly, and carries risks. Noninvasive predictive models using ultrasound and serological markers are essential for identifying high-risk patients and optimizing endoscopy utilization. Machine learning (ML) offers a powerful approach to analyzing complex clinical data and improving predictive accuracy. This study hypothesized that ML models using noninvasive ultrasound and serological markers can accurately predict the risk of EGVs in hepatitis B patients, thereby improving clinical decision-making. AIM To construct and validate a noninvasive ML-based predictive model for EGVs in hepatitis B patients. METHODS We retrospectively collected ultrasound and serological data from 310 eligible cases and randomly divided them into training (80%) and validation (20%) groups. Eleven ML algorithms were used to build predictive models. Model performance was evaluated using the area under the curve and decision curve analysis. The best-performing model was further analyzed using SHapley Additive exPlanation to interpret feature importance. RESULTS Among the 310 patients, 124 were identified as high-risk for EGVs. The extreme gradient boosting model demonstrated the best performance, achieving an area under the curve of 0.96 in the validation set. The model also exhibited high sensitivity (78%), specificity (94%), positive predictive value (84%), negative predictive value (88%), F1 score (83%), and overall accuracy (86%). The top four predictive variables were albumin, prothrombin time, portal vein flow velocity, and spleen stiffness. A web-based version of the model was developed for clinical use, providing real-time predictions for high-risk patients. CONCLUSION We identified an efficient noninvasive predictive model using extreme gradient boosting for EGVs among hepatitis B patients. The model, presented as a web application, has potential for screening high-risk EGV patients and can aid clinicians in optimizing the use of endoscopy.
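The screening metrics listed in the RESULTS (sensitivity, specificity, positive and negative predictive value, F1 score, accuracy) are all functions of the four cells of a binary confusion matrix. The following sketch shows those definitions. The counts are hypothetical and are not taken from the study; the function name `binary_metrics` is likewise an illustration.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the high-risk class
    specificity = tn / (tn + fp)  # recall on the low-risk class
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical validation-set counts, for illustration only.
m = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because predictive values depend on how common the high-risk class is, the same sensitivity and specificity can yield different PPV and NPV in a population with a different prevalence.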
The first 2^+ excited states of the nucleus directly reflect the interaction between the shell structure and the nucleus, providing insights into the validity of the shell model and nuclear structure characteristics. Although the features of the first 2^+ excited states can be measured for stable nuclei and calculated using nuclear models, significant uncertainty remains. This study employs a machine learning model based on a light gradient boosting machine (LightGBM) to investigate the first 2^+ excited states. Specifically, the training of the LightGBM algorithm and the prediction of the first 2^+ properties of 642 nuclei are presented. Furthermore, the LightGBM predictions were compared in detail with available experimental data, shell model calculations, and Bayesian neural network (BNN) predictions. The results revealed that the average difference between the LightGBM predictions and the experimental data was 18 times smaller than that obtained by the shell model and only 70% of that of the BNN predictions. Considering Mg, Ca, Kr, Sm, and Pb isotopes as examples, it was also observed that LightGBM can effectively reproduce the magic number mutation caused by shell effects, with the energy being as low as 0.04 MeV due to shape coexistence. Therefore, we believe that leveraging LightGBM-based machine learning can profoundly enhance our insights into nuclear structures and provide new avenues for nuclear physics research.
This study provides an in-depth comparative evaluation of landslide susceptibility using two distinct spatial units, slope units (SUs) and hydrological response units (HRUs), within Goesan County, South Korea. Leveraging the extreme gradient boosting (XGB) algorithm combined with Shapley Additive Explanations (SHAP), this work assesses the precision and clarity with which each unit predicts areas vulnerable to landslides. SUs are based on geomorphological features such as ridges and valleys and emphasize slope stability and landslide triggers. Conversely, HRUs are established from a variety of hydrological factors, including land cover, soil type, and slope gradients, to capture the dynamic water processes of the region. The methodological framework includes the systematic gathering, preparation, and analysis of data, ranging from historical landslide occurrences to topographical and environmental variables such as elevation, slope angle, and land curvature. The XGB algorithm used to construct the Landslide Susceptibility Model (LSM) was combined with SHAP for model interpretation, and the results were evaluated using Random Cross-validation (RCV) to ensure accuracy and reliability. To ensure optimal model performance, the XGB hyperparameters were tuned using Differential Evolution, considering multicollinearity-free variables. The results show that both SUs and HRUs are effective for LSM, but their effectiveness varies with landscape characteristics. The XGB algorithm demonstrates strong predictive power, and SHAP makes the role of the influential variables more transparent. This work underscores the importance of selecting assessment units tailored to specific landscape characteristics for accurate LSM. The integration of advanced machine learning techniques with interpretative tools offers a robust framework for landslide susceptibility assessment, improving both predictive capabilities and model interpretability. Future research should integrate broader data sets and explore hybrid analytical models to strengthen the generalizability of these findings across varied geographical settings.
BACKGROUND Diabetic foot ulcer (DFU) is a serious and destructive complication of diabetes that has a high amputation rate and carries a huge social burden. Early detection of risk factors and early intervention are essential to reduce amputation rates. With the development of artificial intelligence technology, efficient interpretable predictive models can be generated in clinical practice to improve DFU care. AIM To develop and validate an interpretable model for predicting amputation risk in DFU patients. METHODS This retrospective study collected basic data from 599 patients with DFU in Beijing Shijitan Hospital between January 2015 and June 2024. The data set was randomly divided into a training set and a test set with fivefold cross-validation. Three binary-outcome models were built with the eXtreme Gradient Boosting (XGBoost) algorithm, taking risk factors as input to predict amputation probability. Model performance was optimized by tuning the hyperparameters. The predictive performance of the three models was expressed by sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve (AUC). The prediction results were visualized with SHapley Additive exPlanation (SHAP). RESULTS A total of 157 (26.2%) patients underwent minor amputation during hospitalization and 50 (8.3%) had major amputation. All three XGBoost models demonstrated good discriminative ability, with AUC values > 0.7. The model for predicting major amputation achieved the highest performance [AUC = 0.977, 95% confidence interval (CI): 0.956-0.998], followed by the minor amputation model (AUC = 0.800, 95% CI: 0.762-0.838) and the non-amputation model (AUC = 0.772, 95% CI: 0.730-0.814). Feature importance ranking of the three models revealed the risk factors for minor and major amputation. Wagner grade 4/5, osteomyelitis, and high C-reactive protein were all considered important predictive variables. CONCLUSION XGBoost effectively predicts diabetic foot amputation risk and provides interpretable insights to support personalized treatment decisions.
Cyber-physical systems (CPS) represent a sophisticated integration of computational and physical components that power critical applications such as smart manufacturing, healthcare, and autonomous infrastructure. However, their extensive reliance on internet connectivity makes them increasingly susceptible to cyber threats, potentially leading to operational failures and data breaches. Furthermore, CPS face significant threats related to unauthorized access, improper management, and tampering with the content they generate. In this paper, we propose an intrusion detection system (IDS) optimized for CPS environments that combines a nature-inspired feature selection scheme, Grey Wolf Optimization (GWO), with the Light Gradient Boosting Machine (LightGBM) classifier; we name the hybrid GWO-LightGBM. While gradient boosting methods have been explored in prior IDS research, our novelty lies in a hybrid approach targeting CPS-specific operational constraints, such as low-latency response and accurate detection of rare and critical attack types. We evaluate GWO-LightGBM against GWO-XGBoost, GWO-CatBoost, and an artificial neural network (ANN) baseline using the NSL-KDD and CIC-IDS-2017 benchmark datasets. The models are assessed across multiple metrics, including accuracy, precision, recall, and F1-score, with an emphasis on class-wise performance and training efficiency. The proposed GWO-LightGBM model achieves the highest overall accuracy, 99.73% on NSL-KDD and 99.61% on CIC-IDS-2017, and shows superior performance in detecting minority classes such as Remote-to-Local (R2L) and Other attacks, which are commonly overlooked by other classifiers. Moreover, the proposed model requires less training time, highlighting its practical feasibility and scalability for real-time CPS deployment.
The methods of network attacks have become increasingly sophisticated, rendering traditional cybersecurity defense mechanisms insufficient to address novel and complex threats effectively. In recent years, artificial intelligence has achieved significant progress in the field of network security. However, many challenges remain, particularly regarding the interpretability of deep learning and ensemble learning algorithms. To make network attack prediction models more interpretable, this paper proposes a method that combines Light Gradient Boosting Machine (LGBM) and SHapley Additive exPlanations (SHAP). LGBM is employed to model anomalous fluctuations in various network indicators, enabling rapid and accurate identification and prediction of potential network attack types and thereby facilitating timely defense measures. The model achieved an accuracy of 0.977, precision of 0.985, recall of 0.975, and an F1 score of 0.979, outperforming other models in network attack prediction. SHAP is used to analyze the black-box decision-making process of the model, providing interpretability by quantifying the contribution of each feature to the prediction results and elucidating the relationships between features. The experimental results demonstrate that the LGBM-based network attack prediction model exhibits superior accuracy and outstanding predictive capabilities. Moreover, the SHAP-based interpretability analysis significantly improves the model's transparency and interpretability.
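SHAP attributions are built on Shapley values, which average a feature's marginal contribution over all orders in which features can be revealed to the model. The sketch below computes exact Shapley values for a toy three-feature function by enumerating permutations against a single baseline input. It illustrates the underlying idea only: it is not the SHAP library's algorithm (which uses faster, model-specific estimators and a background dataset), and `toy_model`, the input, and the baseline are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]              # reveal feature i
            value = predict(current)
            phi[i] += value - previous     # marginal contribution of feature i
            previous = value
    return [p / len(orders) for p in phi]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * features[0] + features[1] * features[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to `toy_model(x) - toy_model(baseline)`, and the interaction term is split equally between the two features that produce it. Enumerating all permutations scales factorially with the number of features, so this form is only practical for very small examples.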
A database of 254 rockburst events was examined for rockburst damage classification using stochastic gradient boosting (SGB) methods. Five potentially relevant indicators were analyzed: the stress condition factor, the ground support system capacity, the excavation span, the geological structure, and the peak particle velocity of rockburst sites. The performance of the model was evaluated using a 10-fold cross-validation (CV) procedure with 80% of the original data during modeling, and an external testing set (20%) was employed to validate the prediction performance of the SGB model. Two accuracy measures for multi-class problems were employed: classification accuracy rate and Cohen's Kappa. The accuracy analysis together with Kappa for the rockburst damage dataset reveals that the SGB model for the prediction of rockburst damage is acceptable.
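Cohen's Kappa complements the accuracy rate by discounting the agreement that would be expected by chance given the class frequencies. A minimal sketch of the calculation for a multi-class confusion matrix follows. The matrix is hypothetical and does not come from the rockburst dataset.

```python
def cohens_kappa(confusion):
    """Cohen's Kappa for a square confusion matrix (rows: true, cols: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if predictions were independent of the true class.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical three-class damage confusion matrix, for illustration only.
confusion = [
    [20, 5, 0],
    [4, 15, 6],
    [1, 5, 24],
]
kappa = cohens_kappa(confusion)
accuracy = sum(confusion[i][i] for i in range(3)) / sum(map(sum, confusion))
print(f"accuracy={accuracy:.3f}, kappa={kappa:.3f}")
```

Kappa is 1 for perfect agreement and 0 when the classifier does no better than chance, which makes it more informative than raw accuracy when damage classes are imbalanced.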
Automatic seizure detection is significant for epilepsy diagnosis and can reduce the workload of inspecting prolonged electroencephalogram (EEG) recordings. This paper presents and investigates a novel machine learning approach that uses gradient boosting to detect seizures from long-term EEG. We apply the relative fluctuation index to extract features from long-term intracranial EEG data. A classifier trained with the gradient boosting algorithm is adopted to discriminate between seizure and non-seizure EEG signals. Smoothing and a collar technique are finally used as post-processing to further improve detection accuracy. The seizure detection method is assessed on the Freiburg EEG dataset from 21 patients. The experimental results indicate that the proposed method yields an average sensitivity of 94.60% with a false detection rate of 0.18/h.
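The post-processing stage can be pictured as follows: classifier scores for consecutive EEG epochs are smoothed to suppress isolated spikes, thresholded into seizure labels, and then each detected event is widened by a "collar" on both sides so that seizure onset and offset are not clipped. The abstract does not specify the filter, threshold, or collar length, so the moving-average window, the 0.5 threshold, and the one-epoch collar below are assumptions, as are the scores.

```python
def moving_average(scores, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def apply_collar(labels, collar):
    """Extend every positive label by `collar` epochs on both sides."""
    n = len(labels)
    extended = [0] * n
    for i, label in enumerate(labels):
        if label:
            for j in range(max(0, i - collar), min(n, i + collar + 1)):
                extended[j] = 1
    return extended


def postprocess(scores, window=3, threshold=0.5, collar=1):
    """Smooth, threshold, then apply the collar (all settings are assumptions)."""
    smoothed = moving_average(scores, window)
    labels = [1 if s > threshold else 0 for s in smoothed]
    return apply_collar(labels, collar)


# Hypothetical per-epoch classifier scores: one isolated spike, one sustained event.
scores = [0.0, 0.9, 0.0, 0.0, 0.8, 0.9, 0.7, 0.0, 0.0, 0.0]
print(postprocess(scores))
```

In this example the isolated spike at the second epoch is removed by smoothing, while the sustained event is kept and widened by one epoch on each side.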
This work generated landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, using different machine learning models. Three methods, namely gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in generating landslide susceptibility maps with the GBDT, RF, and InV models. All of the maps of the causative factors were resampled to a resolution of 28.5 m. Of the 486289 pixels in the area, 28526 were landslide pixels and 457763 were non-landslide pixels. Finally, landslide susceptibility maps were generated using the three models, and their performances were assessed through receiver operating characteristic (ROC) curves, sensitivity, specificity, overall accuracy (OA), and the kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models overall produced reasonably accurate landslide susceptibility maps. Among the three, the GBDT method outperformed the other two, and it can provide strong technical support for producing landslide susceptibility maps in the TGR area.
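The information value approach scores each class of a causative factor by comparing the landslide density inside that class with the landslide density of the whole study area. The abstract does not give the formula, so the sketch below assumes one common formulation, the natural logarithm of that density ratio. The study-area totals are the pixel counts quoted in the abstract; the per-class counts are hypothetical.

```python
import math

TOTAL_PIXELS = 486289      # from the abstract
LANDSLIDE_PIXELS = 28526   # from the abstract


def information_value(class_landslide_pixels, class_pixels):
    """ln(landslide density in the class / landslide density overall)."""
    class_density = class_landslide_pixels / class_pixels
    overall_density = LANDSLIDE_PIXELS / TOTAL_PIXELS
    return math.log(class_density / overall_density)


# Hypothetical slope classes: (landslide pixels, total pixels in the class).
slope_classes = {
    "gentle": (2000, 100000),
    "moderate": (12000, 100000),
}
for name, (landslide, total) in slope_classes.items():
    print(name, round(information_value(landslide, total), 3))
```

A positive value marks a class where landslides are over-represented relative to the area as a whole, and a negative value marks one where they are under-represented. Summing the values of the classes a pixel falls into, across all causative factors, gives that pixel's susceptibility score under this formulation.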
Accurate assessment of the undrained shear strength (USS) of soft sensitive clays is a major concern in geotechnical engineering practice. This study applies two data-driven ensemble learning methods, extreme gradient boosting (XGBoost) and random forest (RF), to capture the relationships between the USS and various basic soil parameters. Based on soil data sets from the TC304 database, a general approach is developed to predict the USS of soft clays using these two methods, with five feature variables: the preconsolidation stress (PS), vertical effective stress (VES), liquid limit (LL), plastic limit (PL), and natural water content (W). To reduce the dependence on rules of thumb and inefficient brute-force search, Bayesian optimization is applied to determine appropriate hyper-parameters for both XGBoost and RF. The developed models are comprehensively compared with three other machine learning methods and two transformation models with respect to predictive accuracy and robustness under 5-fold cross-validation (CV). The XGBoost-based and RF-based methods outperform these approaches. In addition, the XGBoost-based model provides feature importance ranks, which enhances the interpretability of the model and makes it a promising tool for predicting geotechnical parameters.
Funding (urban polycentricity and carbon emission efficiency study): Under the auspices of the National Natural Science Foundation of China (No. 42571300).
Funding (esophagogastric varices prediction study): Supported by the Agency Natural Science Foundation of Fujian Province, China, No. 2022J011285 and No. 2023J011480.
Funding (LightGBM study of first 2^+ excited states): Supported by the National Key R&D Program of China (No. 2022YFA1603300); the Romanian Ministry of Research, Innovation and Digitalization under Contract PN 23.21.01.06; the ELI-RO project with Contract ELI-RORDI-2024-008 (AMAP); and a grant from the Romanian Ministry of Research, Innovation and Digitization, CNCS-UEFIS-CDI, with project numbers PN-Ⅲ-P4-PCE-2021-1014, PN-Ⅲ-P4-PCE-2021-0595, and PN-Ⅲ-P1-1.1-TE2021-1464 within PNCDI Ⅲ.
Funding (Goesan County landslide susceptibility study): Supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (RS-2023-00222536).
Abstract: This study provides an in-depth comparative evaluation of landslide susceptibility using two distinct spatial units, slope units (SUs) and hydrological response units (HRUs), within Goesan County, South Korea. Leveraging the extreme gradient boosting (XGB) algorithm combined with Shapley Additive Explanations (SHAP), this work assesses the precision and clarity with which each unit predicts areas vulnerable to landslides. SUs are based on geomorphological features such as ridges and valleys, with a focus on slope stability and landslide triggers. Conversely, HRUs are established based on a variety of hydrological factors, including land cover, soil type and slope gradient, to encapsulate the dynamic water processes of the region. The methodological framework includes the systematic gathering, preparation and analysis of data, ranging from historical landslide occurrences to topographical and environmental variables such as elevation, slope angle and land curvature. The XGB algorithm used to construct the Landslide Susceptibility Model (LSM) was combined with SHAP for model interpretation, and the results were evaluated using Random Cross-validation (RCV) to ensure accuracy and reliability. To ensure optimal model performance, the XGB algorithm’s hyperparameters were tuned using Differential Evolution, considering multicollinearity-free variables. The results show that both SUs and HRUs are effective for LSM, but their effectiveness varies depending on landscape characteristics. The XGB algorithm demonstrates strong predictive power, and SHAP makes the contribution of the influential variables more transparent. This work underscores the importance of selecting appropriate assessment units tailored to specific landscape characteristics for accurate LSM. The integration of advanced machine learning techniques with interpretative tools offers a robust framework for landslide susceptibility assessment, improving both predictive capabilities and model interpretability. Future research should integrate broader data sets and explore hybrid analytical models to strengthen the generalizability of these findings across varied geographical settings.
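To make the idea behind SHAP attributions concrete, the following is a minimal sketch that computes exact Shapley values for a tiny set function by enumerating all feature subsets. It is not the SHAP library and not the study's model: `shapley_values`, `toy_score`, the three factor names and the score increments are all hypothetical, chosen only so the arithmetic can be followed by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(known):
    """Hypothetical susceptibility score given the set of known factors."""
    score = 0.0
    if "slope" in known:
        score += 0.4
    if "soil" in known:
        score += 0.1
    if "slope" in known and "land_cover" in known:
        score += 0.2  # interaction term, split evenly by the Shapley rule
    return score


phi = shapley_values(toy_score, ["slope", "soil", "land_cover"])
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all factors and the score with none, which is the efficiency property that makes Shapley-based explanations additive. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on model-specific or sampling-based approximations.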
Abstract: BACKGROUND: Diabetic foot ulcer (DFU) is a serious and destructive complication of diabetes, which has a high amputation rate and carries a huge social burden. Early detection of risk factors and intervention are essential to reduce amputation rates. With the development of artificial intelligence technology, efficient interpretable predictive models can be generated in clinical practice to improve DFU care. AIM: To develop and validate an interpretable model for predicting amputation risk in DFU patients. METHODS: This retrospective study collected basic data from 599 patients with DFU in Beijing Shijitan Hospital between January 2015 and June 2024. The data set was randomly divided into a training set and a test set with fivefold cross-validation. Three binary-outcome models were built with the eXtreme Gradient Boosting (XGBoost) algorithm, taking risk factors as input to predict amputation probability. Model performance was optimized by tuning the hyperparameters. The predictive performance of the three models was expressed by sensitivity, specificity, positive predictive value, negative predictive value and area under the curve (AUC). Visualization of the prediction results was realized through SHapley Additive exPlanation (SHAP). RESULTS: A total of 157 (26.2%) patients underwent minor amputation during hospitalization and 50 (8.3%) had major amputation. All three XGBoost models demonstrated good discriminative ability, with AUC values > 0.7. The model for predicting major amputation achieved the highest performance [AUC = 0.977, 95% confidence interval (CI): 0.956-0.998], followed by the minor amputation model (AUC = 0.800, 95% CI: 0.762-0.838) and the non-amputation model (AUC = 0.772, 95% CI: 0.730-0.814). Feature importance ranking of the three models revealed the risk factors for minor and major amputation. Wagner grade 4/5, osteomyelitis, and high C-reactive protein were all considered important predictive variables. CONCLUSION: XGBoost effectively predicts diabetic foot amputation risk and provides interpretable insights to support personalized treatment decisions.
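The four threshold-based measures named in the METHODS can be written down directly from a confusion matrix. The sketch below is illustrative only: `binary_metrics` is a hypothetical helper and the confusion-matrix counts are invented for the example. Only the cohort figures (599 patients, 157 minor and 50 major amputations) come from the abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, NOT taken from the study
example = binary_metrics(tp=40, fp=10, tn=90, fn=10)
print(example)

# Cohort proportions reported in the abstract
n_patients, minor, major = 599, 157, 50
minor_pct = round(100 * minor / n_patients, 1)
major_pct = round(100 * major / n_patients, 1)
print(minor_pct, major_pct)  # 26.2 8.3
```

Because major amputation affects only about 8% of the cohort, PPV and NPV depend strongly on this prevalence, which is one reason to report them alongside sensitivity and specificity.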
Funding: Supported by the Culture, Sports and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports and Tourism in 2024 (Project Name: Global Talent Training Program for Copyright Management Technology in Game Contents, Project Number: RS-2024-00396709, Contribution Rate: 100%).
Abstract: Cyber-physical systems (CPS) represent a sophisticated integration of computational and physical components that power critical applications such as smart manufacturing, healthcare, and autonomous infrastructure. However, their extensive reliance on internet connectivity makes them increasingly susceptible to cyber threats, potentially leading to operational failures and data breaches. Furthermore, CPS face significant threats related to unauthorized access, improper management, and tampering of the content they generate. In this paper, we propose an intrusion detection system (IDS) optimized for CPS environments that combines a nature-inspired feature selection scheme, Grey Wolf Optimization (GWO), with the Light Gradient Boosting Machine (LightGBM) classifier; we refer to this hybrid as GWO-LightGBM. While gradient boosting methods have been explored in prior IDS research, our novelty lies in proposing a hybrid approach targeting CPS-specific operational constraints, such as low-latency response and accurate detection of rare and critical attack types. We evaluate GWO-LightGBM against GWO-XGBoost, GWO-CatBoost, and an artificial neural network (ANN) baseline using the NSL-KDD and CIC-IDS-2017 benchmark datasets. The models are assessed across multiple metrics, including accuracy, precision, recall, and F1-score, with an emphasis on class-wise performance and training efficiency. The proposed GWO-LightGBM model achieves the highest overall accuracy, 99.73% on NSL-KDD and 99.61% on CIC-IDS-2017, demonstrating superior performance in detecting minority classes such as Remote-to-Local (R2L) and Other attacks, which are commonly overlooked by other classifiers. Moreover, the proposed model requires less training time, highlighting its practical feasibility and scalability for real-time CPS deployment.
Funding: Supported by the National Natural Science Foundation of China Project (No. 62302540); see https://www.nsfc.gov.cn/ (accessed on 18 June 2024).
Abstract: The methods of network attacks have become increasingly sophisticated, rendering traditional cybersecurity defense mechanisms insufficient to address novel and complex threats effectively. In recent years, artificial intelligence has achieved significant progress in the field of network security. However, many challenges and issues remain, particularly regarding the interpretability of deep learning and ensemble learning algorithms. To address the challenge of enhancing the interpretability of network attack prediction models, this paper proposes a method that combines Light Gradient Boosting Machine (LGBM) and SHapley Additive exPlanations (SHAP). LGBM is employed to model anomalous fluctuations in various network indicators, enabling the rapid and accurate identification and prediction of potential network attack types and thereby facilitating the implementation of timely defense measures. The model achieved an accuracy of 0.977, precision of 0.985, recall of 0.975, and an F1 score of 0.979, demonstrating better performance than other models in the domain of network attack prediction. SHAP is utilized to analyze the black-box decision-making process of the model, providing interpretability by quantifying the contribution of each feature to the prediction results and elucidating the relationships between features. The experimental results demonstrate that the network attack prediction model based on LGBM exhibits superior accuracy and outstanding predictive capabilities. Moreover, the SHAP-based interpretability analysis significantly improves the model’s transparency and interpretability.
Funding: Project (2015CX005) supported by the Innovation Driven Plan of Central South University of China; Project supported by the Sheng Hua Lie Ying Program of Central South University, China.
Abstract: A database of 254 rockburst events was examined for rockburst damage classification using stochastic gradient boosting (SGB) methods. Five potentially relevant indicators of rockburst sites were analyzed: the stress condition factor, the ground support system capacity, the excavation span, the geological structure and the peak particle velocity. The performance of the model was evaluated using a 10-fold cross-validation (CV) procedure with 80% of the original data during modeling, and an external testing set (20%) was employed to validate the prediction performance of the SGB model. Two accuracy measures for multi-class problems were employed: classification accuracy rate and Cohen’s Kappa. The accuracy analysis together with Kappa for the rockburst damage dataset reveals that the SGB model for the prediction of rockburst damage is acceptable.
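Cohen's Kappa corrects the classification accuracy rate for the agreement expected by chance, which matters when damage classes are unevenly represented. The sketch below shows the standard multi-class formula in plain Python. The function name `cohen_kappa` and the six damage labels are hypothetical and are not drawn from the rockburst database.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa for multi-class labels: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement: the plain classification accuracy rate
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance, from the marginal label frequencies
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical damage levels for six events
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "low", "medium", "high", "high", "high"]
kappa = cohen_kappa(y_true, y_pred)
print(round(kappa, 3))  # 0.75
```

Here the accuracy is 5/6 (about 0.833) while Kappa is 0.75, because one third of the agreement would be expected by chance alone. The formula is undefined when the expected agreement equals 1, for example when both label lists contain a single class.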
Funding: Key Program of Natural Science Foundation of Shandong Province (No. ZR2013FZ002); The Program of Science and Technology of Suzhou (No. ZXY2013030); Independent Innovation Foundation of Shandong University (No. 11170074611102).
Abstract: Automatic seizure detection is significant for epilepsy diagnosis and can reduce the workload of inspecting prolonged electroencephalogram (EEG) recordings. This paper presents and investigates a novel machine learning approach utilizing gradient boosting to detect seizures from long-term EEG. We apply the relative fluctuation index to extract features from long-term intracranial EEG data. A classifier trained with the gradient boosting algorithm is adopted to discriminate between seizure and non-seizure EEG signals. Smoothing and collar techniques are finally applied as post-processing to further improve the detection accuracy. The seizure detection method is assessed on the Freiburg EEG datasets from 21 patients. The experimental results indicate that the proposed method yields an average sensitivity of 94.60% with a false detection rate of 0.18/h.
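The abstract does not specify how the smoothing and collar steps are implemented. As a rough sketch under common assumptions, smoothing is taken here to be a centered moving average over per-epoch classifier outputs, and the collar to be an extension of each detected event by a fixed number of epochs on both sides. The function names, window width, threshold and collar length are all hypothetical.

```python
def moving_average(scores, width):
    """Centered moving average; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(scores)):
        window = scores[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def apply_collar(labels, collar):
    """Extend every positive epoch by `collar` epochs on both sides."""
    out = [0] * len(labels)
    for i, flag in enumerate(labels):
        if flag:
            for j in range(max(0, i - collar), min(len(labels), i + collar + 1)):
                out[j] = 1
    return out


def postprocess(scores, width=3, threshold=0.5, collar=1):
    """Smooth the scores, threshold them, then widen the detected events."""
    smoothed = moving_average(scores, width)
    labels = [1 if s > threshold else 0 for s in smoothed]
    return apply_collar(labels, collar)


# Hypothetical per-epoch outputs: one isolated spike, then a sustained event
raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 0]
detections = postprocess(raw)
print(detections)  # [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
```

In this toy run the isolated spike at index 2 is suppressed by the smoothing step, which is how such post-processing can lower the false detection rate, while the collar restores event boundaries that the smoothing tends to shorten.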
Funding: This work was supported in part by the National Natural Science Foundation of China (61601418, 41602362, 61871259), in part by the Opening Foundation of Hunan Engineering and Research Center of Natural Resource Investigation and Monitoring (2020-5), in part by the Qilian Mountain National Park Research Center (Qinghai) (grant number: GKQ2019-01), and in part by the Geomatics Technology and Application Key Laboratory of Qinghai Province (Grant No. QHDX-2019-01).
Abstract: This work generated landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, using different machine learning models. Three advanced machine learning methods, namely, gradient boosting decision tree (GBDT), random forest (RF) and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped by using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in landslide susceptibility map generation by using the GBDT, RF and InV models. All of the maps of the causative factors were resampled to a resolution of 28.5 m. Of the 486289 pixels in the area, 28526 pixels were landslide pixels, and 457763 pixels were non-landslide pixels. Finally, landslide susceptibility maps were generated by using the three machine learning models, and their performances were assessed through receiver operating characteristic (ROC) curves, the sensitivity, specificity, overall accuracy (OA), and kappa coefficient (KAPPA). The results showed that the GBDT, RF and InV models overall produced reasonably accurate landslide susceptibility maps. Among these three methods, the GBDT method outperformed the other two, and it can provide strong technical support for producing landslide susceptibility maps in the TGR area.
Funding: Financial support from the High-end Foreign Expert Introduction program (No. G20190022002), the Chongqing Construction Science and Technology Plan Project (2019-0045), and the Chongqing Engineering Research Center of Disaster Prevention & Control for Banks and Structures in Three Gorges Reservoir Area (Nos. SXAPGC18ZD01 and SXAPGC18YB03).
Abstract: Accurate assessment of undrained shear strength (USS) for soft sensitive clays is a great concern in geotechnical engineering practice. This study applies data-driven extreme gradient boosting (XGBoost) and random forest (RF) ensemble learning methods to capture the relationships between the USS and various basic soil parameters. Based on the soil data sets from the TC304 database, a general approach is developed to predict the USS of soft clays using these two machine learning methods, where five feature variables are adopted: the preconsolidation stress (PS), vertical effective stress (VES), liquid limit (LL), plastic limit (PL) and natural water content (W). To reduce the dependence on rules of thumb and inefficient brute-force search, the Bayesian optimization method is applied to determine appropriate model hyper-parameters for both XGBoost and RF. The developed models are comprehensively compared with three other machine learning methods and two transformation models with respect to predictive accuracy and robustness under 5-fold cross-validation (CV). It is shown that the XGBoost-based and RF-based methods outperform these approaches. In addition, the XGBoost-based model provides feature importance ranks, which enhances the interpretability of the model and makes it a promising tool for the prediction of geotechnical parameters.
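The 5-fold cross-validation used for the model comparison can be illustrated with a small index splitter. This is a generic sketch and not the study's code: `k_fold_indices`, the sample count of 23 and the seed are hypothetical, and in practice a library routine would typically be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=23, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every model under comparison is scored on all of the data while never being evaluated on samples it was fitted to within a given fold.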