Funding: Supported by the National Natural Science Foundation of China Project (No. 62302540); please visit their website at https://www.nsfc.gov.cn/ (accessed on 18 June 2024).
Abstract: The methods of network attacks have become increasingly sophisticated, rendering traditional cybersecurity defense mechanisms insufficient to address novel and complex threats effectively. In recent years, artificial intelligence has achieved significant progress in the field of network security. However, many challenges and issues remain, particularly regarding the interpretability of deep learning and ensemble learning algorithms. To address the challenge of enhancing the interpretability of network attack prediction models, this paper proposes a method that combines Light Gradient Boosting Machine (LGBM) and SHapley Additive exPlanations (SHAP). LGBM is employed to model anomalous fluctuations in various network indicators, enabling the rapid and accurate identification and prediction of potential network attack types and thereby facilitating the implementation of timely defense measures. The model achieved an accuracy of 0.977, precision of 0.985, recall of 0.975, and an F1 score of 0.979, outperforming other models in the domain of network attack prediction. SHAP is utilized to analyze the black-box decision-making process of the model, providing interpretability by quantifying the contribution of each feature to the prediction results and elucidating the relationships between features. The experimental results demonstrate that the network attack prediction model based on LGBM exhibits superior accuracy and outstanding predictive capabilities. Moreover, the SHAP-based interpretability analysis significantly improves the model's transparency and interpretability.
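To make concrete what "quantifying the contribution of each feature" means, the following is a minimal sketch of exact Shapley values computed by brute force over all feature subsets. It is not the paper's pipeline: the scoring function `toy_score`, the three indicator names, and the all-zero baseline are hypothetical, and features outside a subset are simply replaced by their baseline value. Brute force is only practical for a handful of features; tree-specific algorithms in the SHAP library exist precisely to avoid this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical attack score over three made-up traffic indicators.
    packet_rate, failed_logins, payload_entropy = z
    return 0.5 * packet_rate + 2.0 * failed_logins + packet_rate * payload_entropy


x = [2.0, 1.0, 3.0]          # hypothetical observation
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_score, x, baseline)
print([round(v, 6) for v in phi])  # approximately [4.0, 2.0, 3.0]
```

The contributions sum to `toy_score(x) - toy_score(baseline)`, and the interaction term `packet_rate * payload_entropy` is split evenly between the two features involved, which is the additive property that SHAP-style explanations rely on.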
Funding: Supported by the Major National Science and Technology Programs in the “Thirteenth Five-Year” Plan period (Grant No. 2017ZX05032-002-004) and the Innovation Team Funding of Natural Science Foundation of Hubei Province, China (Grant No. 2021CFA031), with stipend support from the Chinese Scholarship Council (CSC) and the Silk Road Institute.
Abstract: Accurate reservoir permeability determination is crucial in hydrocarbon exploration and production. Conventional methods relying on empirical correlations and assumptions often result in high costs, time consumption, inaccuracies, and uncertainties. This study introduces a novel hybrid machine learning approach to predict the permeability of the Wangkwar formation in the Gunya oilfield, Northwestern Uganda. The group method of data handling with differential evolution (GMDH-DE) algorithm was used to predict permeability because it can manage complex, nonlinear relationships between variables, reduces computation time, and optimizes parameters through evolutionary algorithms. Using 1953 samples from the Gunya-1 and Gunya-2 wells for training and 1563 samples from Gunya-3 for testing, the GMDH-DE outperformed the group method of data handling (GMDH) and random forest (RF) in predicting permeability, with higher accuracy and lower computation time. The GMDH-DE achieved an R^2 of 0.9985, RMSE of 3.157, MAE of 2.366, and ME of 0.001 during training; for testing, the ME, MAE, RMSE, and R^2 were 1.3508, 12.503, 21.3898, and 0.9534, respectively. Additionally, the GMDH-DE demonstrated a 41% reduction in processing time compared to GMDH and RF. The model was also used to predict the permeability of the Mita Gamma well in the Mandawa basin, Tanzania, which lacks core data. Shapley additive explanations (SHAP) analysis identified thermal neutron porosity (TNPH), effective porosity (PHIE), and spectral gamma-ray (SGR) as the most critical parameters in permeability prediction. Therefore, the GMDH-DE model offers a novel, efficient, and accurate approach for fast permeability prediction, enhancing hydrocarbon exploration and production.
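The four error measures quoted above can be computed as in the sketch below. Two points are assumptions rather than statements from the abstract: ME is read as the signed mean error (predicted minus measured), and R^2 is the usual coefficient of determination. The permeability values are made up for illustration.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return ME, MAE, RMSE and R^2 for paired measured/predicted values."""
    n = len(y_true)
    # Assumed sign convention: error = predicted - measured.
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Made-up permeability values, not data from the study.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(measured, predicted)
print({name: round(v, 4) for name, v in metrics.items()})
```

A small ME alongside a much larger MAE, as in the reported training figures, indicates that over- and under-predictions largely cancel out rather than that individual errors are small.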
Funding: Support provided by The Science and Technology Development Fund, Macao SAR, China (File Nos. 0057/2020/AGJ and SKL-IOTSC-2021-2023) and the Science and Technology Program of Guangdong Province, China (Grant No. 2021A0505080009).
Abstract: Accurate prediction of shield tunneling-induced settlement is a complex problem that requires consideration of many influential parameters. Recent studies reveal that machine learning (ML) algorithms can predict the settlement caused by tunneling. However, well-performing ML models are usually less interpretable, and irrelevant input features decrease both the performance and the interpretability of an ML model. Nonetheless, feature selection, a critical step in the ML pipeline, is ignored in most studies that focus on predicting tunneling-induced settlement. This study applies four techniques, i.e., the Pearson correlation method, sequential forward selection (SFS), sequential backward selection (SBS), and the Boruta algorithm, to investigate the effect of feature selection on the model's performance when predicting the tunneling-induced maximum surface settlement (S_max). The data set used in this study was compiled from two metro tunnel projects excavated in Hangzhou, China using earth pressure balance (EPB) shields and consists of 14 input features and a single output (i.e., S_max). The ML model trained on features selected by the Boruta algorithm demonstrates the best performance in both the training and testing phases. The relevant features chosen by the Boruta algorithm further indicate that tunneling-induced settlement is affected by parameters related to tunnel geometry, geological conditions, and shield operation. The recently proposed Shapley additive explanations (SHAP) method is used to explore how the input features contribute to the output of a complex ML model. It is observed that larger settlements are induced during shield tunneling in silty clay. Moreover, the SHAP analysis reveals that low magnitudes of face pressure at the top of the shield increase the model's output.
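Of the four feature-selection techniques listed, the Pearson correlation method is the simplest to illustrate. The sketch below keeps features whose absolute correlation with the target reaches a cutoff. The feature names, the data, and the 0.5 threshold are all hypothetical; the abstract does not state the cutoff used in the study. Pearson correlation captures only linear association, which is one reason wrapper methods such as SFS, SBS, and Boruta can select different features.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_by_correlation(features, target, threshold):
    """Keep features whose |r| with the target is at least `threshold`."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Made-up inputs; names and values are illustrative only.
features = {
    "face_pressure": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cover_depth": [5.0, 3.0, 4.0, 1.0, 2.0],
    "noise": [2.0, 1.0, 2.0, 1.0, 2.0],
}
settlement = [10.0, 8.0, 6.5, 4.0, 2.0]

kept, scores = select_by_correlation(features, settlement, threshold=0.5)
print(kept)  # ['face_pressure', 'cover_depth']
```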
Funding: Supported by the Young Mentor Fund project of Gansu Agricultural University [grant number 0522014] and the Gansu Provincial Science and Technology Plan [grant number 23CXNA0017].
Abstract: Color has emerged as a pivotal factor influencing consumer purchasing decisions in the dried herbal medicine market. To address the issue of significant discoloration of Rhubarb (Rheum rhabarbarum L.) during the drying process, this study investigates the effects of microwave fixation (MF) and hot-air fixation (HAF) pretreatment methods on the drying characteristics and quality of Rhubarb dried by ultrasonic synergistic vacuum far-infrared drying (U-VFID), with a primary focus on its color attributes. The results indicate that fixation pretreatment significantly enhances both drying efficiency and product quality, particularly in terms of color retention. Compared to unpretreated Rhubarb, the best comprehensive drying effect was achieved with MF treatment at 60 °C for 7 min, which resulted in a 441.18% increase in rhein content, a 58.57% reduction in drying time, and a 48.38% decrease in the ΔE value. Furthermore, correlation analysis and the eXtreme Gradient Boosting (XGBoost) algorithm combined with the SHapley Additive exPlanations (SHAP) model revealed that the color of Rhubarb subjected to various fixation pretreatments in conjunction with U-VFID is primarily influenced by sennoside A content, total phenolic content (TPC), and drying time. This study offers a scientific foundation and theoretical insights for optimizing the quality of dried medicinal plant products and introduces innovative approaches for post-harvest industrial pretreatment of rhizomatous medicinal plants.
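The ΔE value referred to above is a total color difference. The abstract does not say which formula was used, so the sketch below assumes the common CIE76 definition, the Euclidean distance between two colors in CIELAB (L*, a*, b*) space, and shows how a percentage decrease in ΔE would follow from it. All color coordinates are made up.

```python
from math import sqrt


def delta_e_cie76(lab_ref, lab_sample):
    """CIE76 color difference: Euclidean distance in (L*, a*, b*) space."""
    return sqrt(sum((s - r) ** 2 for r, s in zip(lab_ref, lab_sample)))


def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Made-up CIELAB readings, not measurements from the study.
fresh = (60.0, 10.0, 30.0)
dried_untreated = (48.0, 14.0, 27.0)
dried_pretreated = (57.0, 14.0, 30.0)

de_untreated = delta_e_cie76(fresh, dried_untreated)    # 13.0
de_pretreated = delta_e_cie76(fresh, dried_pretreated)  # 5.0
change = percent_change(de_untreated, de_pretreated)
print(round(change, 2))  # negative value means less discoloration
```

A lower ΔE relative to the fresh sample means the dried product stayed closer to its original color, which is why a decrease in ΔE is reported as an improvement.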
Funding: Funded by the University Teachers Innovation Fund Project of Gansu Province (2025A-001) and the Northwest Normal University Young Teachers' Scientific Research Ability Improvement Plan (NWNULKQN2024-20).
Abstract: Accurately revealing the spatial heterogeneity in the trade-offs and synergies of land use functions (LUFs) and their driving factors is imperative for advancing sustainable land utilization and optimizing land use planning. This is especially critical for ecologically vulnerable inland river basins in arid regions. However, existing methods struggle to capture the complex nonlinear interactions among environmental factors and their multifaceted relationships with the trade-offs and synergies of LUFs. Consequently, this study focused on the middle reaches of the Heihe River Basin (MHRB), an arid inland river basin in northwestern China. Using land use, socioeconomic, meteorological, and hydrological data from 2000 to 2020, we analyzed the spatiotemporal patterns of LUFs and their trade-off and synergy relationships from the perspective of production, living, and ecological functions. Additionally, we employed an integrated Extreme Gradient Boosting (XGBoost)-SHapley Additive exPlanations (SHAP) framework to investigate the environmental factors influencing the spatial heterogeneity in the trade-offs and synergies of LUFs. Our findings reveal that from 2000 to 2020, the production, living, and ecological functions of land use within the MHRB exhibited an increasing trend, with a distinct spatial pattern of "high in the southwest and low in the northeast". The trade-off and synergistic relationships showed significant spatial heterogeneity, with trade-offs dominating human activity-intensive oasis areas and synergies prevailing elsewhere. During the study period, synergistic relationships between production and living functions and between production and ecological functions were relatively robust, whereas synergies between living and ecological functions remained weaker. Natural factors (digital elevation model (DEM), annual mean temperature, Normalized Difference Vegetation Index (NDVI), and annual precipitation) emerged as the primary factors driving the trade-offs and synergies of LUFs, followed by socioeconomic factors (population density, Gross Domestic Product (GDP), and land use intensity), while distance factors (distance to water bodies, distance to residential areas, and distance to roads) exerted minimal influence. Notably, the interactions among NDVI, annual mean temperature, DEM, and land use intensity exerted the most substantial impacts on the relationships among LUFs. This study provides novel perspectives and methodologies for unraveling the mechanisms underlying the spatial heterogeneity in the trade-offs and synergies of LUFs, offering scientific insights to inform regional land use planning and sustainable natural resource management in inland river basins in arid regions.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFC3007203).
Abstract: This study addresses gaps in aftershock prediction research by proposing an interpretable hybrid machine learning model that leverages multi-source data. The model overcomes challenges related to the selection of influencing factors, model types, prediction result visualization, and decision mechanism interpretability. It integrates mainshock factors, geological features, site characteristics, and terrain conditions using geospatial information system (GIS) technology. By employing the stacking algorithm to optimize and combine XGBoost and LightGBM models, the proposed model significantly improves prediction performance. Visualization through aftershock hazard mapping offers a robust tool for aftershock warning. The Shapley additive explanations (SHAP) model is used to explain the decision-making process from both global and local perspectives. Results show that, compared to the optimized XGBoost-CMA_ES and LightGBM-CMA_ES hybrid models, the stacking model achieves area under the curve (AUC) increases of 7.71% and 5.72% on the test set, respectively, with a maximum prediction accuracy of 0.9344. The hazard zoning map identifies high-risk areas mainly around fault lines and near the epicenter. As hazard levels rise, the proportion and density of aftershocks in these areas increase. The SHAP model results highlight the distance to fault as the most critical factor. The study integrates local explanations with on-site investigations, effectively visualizing the contributions of different factors to aftershocks. This research provides new tools and methods for enhancing aftershock warning and response.
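The AUC comparison above can be illustrated with a small sketch that computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are made up, and `base_scores` and `stacked_scores` are hypothetical stand-ins for a single model and a stacked model. The abstract does not say whether the reported increases are relative or in percentage points, so both are computed.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = aftershock occurred) and model scores.
labels = [1, 0, 1, 0, 1, 0]
base_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
stacked_scores = [0.9, 0.5, 0.8, 0.7, 0.6, 0.2]

auc_base = roc_auc(labels, base_scores)        # 6/9
auc_stacked = roc_auc(labels, stacked_scores)  # 8/9
relative_gain = (auc_stacked - auc_base) / auc_base * 100.0
point_gain = (auc_stacked - auc_base) * 100.0
print(round(relative_gain, 2), round(point_gain, 2))
```

The pairwise loop is quadratic in the number of samples; it is chosen here for clarity, not for use on large test sets.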