This study developed a modeling methodology for statistical optimization-based geologic hazard susceptibility assessment, aiming to enhance the comprehensive performance and classification accuracy of the assessment models. First, the cumulative probability method revealed that a low probability (15%) of geologic hazards between any two geologic hazard points occurred outside a buffer zone with a radius of 2297 m (i.e., the distance threshold). The training dataset was established, consisting of negative samples (non-hazard points) randomly generated based on the distance threshold, positive samples (i.e., historical hazards), and 13 conditioning factors. Then, models were built using five machine learning algorithms, namely random forest (RF), gradient boosting decision tree (GBDT), naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). The comprehensive performance of the models was assessed using the area under the receiver operating characteristic curve (AUC) and overall accuracy (OA) as indicators, revealing that RF exhibited the best performance, with OA and AUC values of 2.7127 and 0.981, respectively. Furthermore, the machine learning models constructed by considering the distance threshold outperformed those built using the unoptimized dataset. The characteristic factors were ranked using the mutual information method, with their scores decreasing in the order of rainfall (0.1616), altitude (0.06), normalized difference vegetation index (NDVI; 0.04), and distance from roads (0.03). Finally, the geologic hazard susceptibility classification was assessed using the natural breaks method combined with a clustering algorithm. The results indicate that the clustering algorithm exhibited higher classification accuracy than the natural breaks method. The findings of this study demonstrate that the proposed model optimization scheme can provide a scientific basis for the prevention and control of geologic hazards.
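
The distance-threshold step described in the abstract above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: the distances are taken to be nearest-neighbor distances between hazard points, the coordinates are planar and in meters, and the function names (`nearest_neighbor_distances`, `distance_threshold`, `sample_negatives`) are hypothetical. It is not the authors' implementation.

```python
import math
import random


def nearest_neighbor_distances(points):
    """Distance from each hazard point to its closest other hazard point."""
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists


def distance_threshold(points, cumulative_prob=0.85):
    """Smallest observed distance whose empirical cumulative
    probability is at least cumulative_prob (assumed definition)."""
    dists = sorted(nearest_neighbor_distances(points))
    # Number of distances that must lie at or below the threshold.
    k = math.ceil(cumulative_prob * len(dists))
    k = min(max(k, 1), len(dists))
    return dists[k - 1]


def sample_negatives(points, threshold, n, bounds, rng, max_tries=100_000):
    """Draw up to n random non-hazard points that lie at least
    `threshold` away from every hazard point (rejection sampling)."""
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    tries = 0
    while len(negatives) < n and tries < max_tries:
        tries += 1
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if all(math.hypot(x - px, y - py) >= threshold for px, py in points):
            negatives.append((x, y))
    return negatives


# Hypothetical hazard inventory (planar coordinates in meters).
hazards = [(0, 0), (100, 0), (0, 100), (100, 100), (400, 400)]
threshold = distance_threshold(hazards, cumulative_prob=0.85)
negatives = sample_negatives(
    hazards, 150.0, 20, (0, 0, 1000, 1000), random.Random(0)
)
```

With `cumulative_prob=0.85`, 15% of the observed distances may exceed the threshold, which mirrors the 15% figure reported in the abstract; the 2297 m value itself depends on the study's inventory and cannot be reproduced from this toy data.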
The integration of machine learning (ML) into geohazard assessment has instigated a paradigm shift, producing models with a level of predictive accuracy previously considered unattainable. However, the black-box nature of these systems presents a significant barrier, hindering their operational adoption, regulatory approval, and full scientific validation. This paper provides a systematic review and synthesis of the emerging field of explainable artificial intelligence (XAI) as applied to geohazard science (GeoXAI), a domain that aims to resolve the long-standing trade-off between model performance and interpretability. A rigorous synthesis of 87 foundational studies is used to map the intellectual and methodological contours of this rapidly expanding field. The analysis reveals that current research efforts are concentrated predominantly on landslide and flood assessment. Methodologically, tree-based ensembles and deep learning models dominate the literature, with SHapley Additive exPlanations (SHAP) frequently adopted as the principal post-hoc explanation technique. More importantly, the review documents how the role of XAI has shifted: rather than being used solely as a tool for interpreting models after training, it is increasingly integrated into the modeling cycle itself. Recent applications include its use in feature selection, adaptive sampling strategies, and model evaluation. The evidence also shows that GeoXAI extends beyond producing feature rankings. It reveals nonlinear thresholds and interaction effects that generate deeper mechanistic insights into hazard processes. Nevertheless, several key challenges remain unresolved within the field. These persistent issues are especially pronounced with respect to interpretation stability, the reliable distinction of correlation from causation, and the treatment of complex spatio-temporal dynamics.
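
To make the idea behind SHAP-style attribution concrete, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions. It does not use the `shap` library, and the toy model `toy_susceptibility`, its coefficients, and the feature names are purely hypothetical assumptions chosen so that an interaction effect is visible; none of them come from the reviewed studies.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance x, where features outside
    a coalition are replaced by their baseline values."""
    n = len(x)

    def value(subset):
        # Evaluate the model with only the coalition's features "switched on".
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_susceptibility(z):
    """Hypothetical score with a rainfall-slope interaction term."""
    rainfall, slope, ndvi = z
    return 0.5 * rainfall + 0.2 * slope - 0.1 * ndvi + 0.3 * rainfall * slope


x = [1.0, 1.0, 1.0]          # instance to explain (scaled features)
baseline = [0.0, 0.0, 0.0]   # reference input
phi = shapley_values(toy_susceptibility, x, baseline)
```

In this toy case the interaction term is split evenly between rainfall and slope, so the attributions sum to the difference between the prediction and the baseline output. Exact enumeration scales exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.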
Funding: Supported by a project entitled "Loess Plateau Region-Watershed-Slope Geological Hazard Multi-Scale Collaborative Intelligent Early Warning System" of the National Key R&D Program of China (2022YFC3003404), a project of the Shaanxi Youth Science and Technology Star (2021KJXX-87), and public welfare geological survey projects of the Shaanxi Institute of Geologic Survey (20180301, 201918, 202103, and 202413).