Funding: Supported by the National Research Foundation of Korea (NRF), funded by the Korean government (MSIT) (Grant numbers: RS-2025-02316700 and RS-2025-00522430), and the China Scholarship Council Program.
Abstract: The bandgap is a key parameter for understanding and designing the properties of hybrid perovskite materials, as well as for developing photovoltaic devices. Traditional methods of determining the bandgap, such as ultraviolet-visible spectroscopy and first-principles calculations, are time- and power-consuming, and they are poorly suited to capturing bandgap change mechanisms for hybrid perovskite materials across a wide, unexplored space. In the present work, an artificial intelligence ensemble comprising two classifiers (with F1 scores of 0.9125 and 0.925) and a regressor (with a mean squared error of 0.0014 eV) is constructed to achieve high-precision prediction of the bandgap. A perovskite bandgap dataset is established through high-throughput prediction of bandgaps by the ensemble. Based on this self-built dataset, partial dependence analysis (PDA) is developed to interpret the mechanisms that influence the bandgap. Meanwhile, an interpretable mathematical model with an R² of 0.8417 is generated using the genetic programming symbolic regression (GPSR) technique. The constructed PDA maps agree well with the SHapley Additive exPlanations, the GPSR model, and experimental verification. Through PDA, we reveal the boundary effect, the bowing effect, and their evolution trends with key descriptors.
Funding: Supported by the Ongoing Research Funding Program (Grant No. ORFFT-2025-025-4) at King Saud University, Riyadh, Saudi Arabia. The grant was awarded to Yassir M. Abbas.
Abstract: The rapid advancement of three-dimensional printed concrete (3DPC) requires intelligent and interpretable frameworks to optimize mixture design for strength, printability, and sustainability. While machine learning (ML) models have improved predictive accuracy, their limited transparency has hindered their widespread adoption in materials engineering. To overcome this barrier, this study introduces a Random Forest ensemble learning model integrated with SHapley Additive exPlanations (SHAP) and Partial Dependence Plots (PDPs) to model and explain the compressive strength behavior of 3DPC mixtures. Unlike conventional "black-box" models, SHAP quantifies each variable's contribution to predictions based on cooperative game theory, which enables causal interpretability, whereas PDPs visualize nonlinear and interactive effects between features and so offer practical mix design insights. A systematically optimized random forest model achieved strong generalization (R² = 0.978 for training, 0.834 for validation, and 0.868 for testing). The analysis identified curing age, Portland cement, silica fume, and the water-to-binder ratio as dominant predictors, with curing age exerting the highest positive influence on strength development. The integrated SHAP-PDP framework revealed synergistic interactions among binder constituents and curing parameters, which established transparent, data-driven guidelines for performance optimization. Theoretically, the study advances explainable artificial intelligence in cementitious material science by linking microstructural mechanisms to model-based reasoning, thereby enhancing both the interpretability and applicability of ML-driven mix design for next-generation 3DPC systems.
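The partial dependence idea mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration only, not the study's code: `partial_dependence`, `toy_predict`, and the three sample mixes are hypothetical, and the toy function merely stands in for a fitted random forest.

```python
from statistics import mean

def partial_dependence(predict, rows, feature_index, grid):
    """Average the model's prediction over all rows while one feature
    is forced to each value in `grid` (one-dimensional partial dependence)."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)            # copy so the data set is not mutated
            modified[feature_index] = value
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve

# Hypothetical stand-in for a fitted model (an assumption, not the paper's model):
# strength rises with curing age (feature 0) and falls with the
# water-to-binder ratio (feature 1).
def toy_predict(row):
    age_days, water_binder = row
    return 20.0 + 0.5 * age_days - 30.0 * water_binder

# Hypothetical mixes: [curing age in days, water-to-binder ratio]
rows = [[7, 0.30], [28, 0.35], [56, 0.40]]
pd_age = partial_dependence(toy_predict, rows, feature_index=0, grid=[7, 28])
```

Plotting `pd_age` against the grid gives the partial dependence curve for curing age; with a real model the same loop would call its prediction method instead of `toy_predict`.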
Abstract: We introduce the Kernel-based Partial Conditional Mean Dependence, a scalar-valued measure of conditional mean dependence of Y given X, while adjusting for the nonlinear dependence on Z. Here X, Y and Z are random elements from arbitrary separable Hilbert spaces. This measure extends the Kernel-based Conditional Mean Dependence. An estimator of the measure is developed, and its concentration property is proved. Numerical results demonstrate the effectiveness of the new dependence measure in the context of dependence testing, highlighting its advantages in capturing nonlinear partial conditional mean dependencies.
Abstract: Rock bursts represent a formidable challenge in underground engineering, posing substantial risks to both infrastructure and human safety. These sudden and violent failures of rock masses are characterized by the rapid release of accumulated stress within the rock, leading to severe seismic events and structural damage. Therefore, the development of reliable prediction models for rock bursts is paramount to mitigating these hazards. This study proposes a tree-based model, a Light Gradient Boosting Machine (LightGBM), to predict the intensity of rock bursts in underground engineering. A total of 322 actual rock burst cases are collected to constitute an exhaustive rock burst dataset, which serves to train the LightGBM model. Two population-based metaheuristic algorithms are used to optimize the hyperparameters of the LightGBM model. Finally, sensitivity analysis is used to identify the predominant factors that may trigger rock bursts. The results show that the population-based metaheuristic algorithms are effective at finding the optimal hyperparameters of the LightGBM model. The developed LightGBM model yields promising performance in predicting the intensity of rock bursts, with accuracies of 0.972 and 0.944 on the training and testing sets, respectively. The sensitivity analysis shows that rock burst risk is significantly sensitive to three factors: uniaxial compressive strength (σc), stress concentration factor (SCF), and elastic strain energy index (Wet). Moreover, this study clarifies the particular impact of these three factors on the intensity of rock bursts through partial dependence plots.
Funding: Anusandhan National Research Foundation (ANRF) (previously Science and Engineering Research Board, SERB), India, grant CRG/2022/002509.
Abstract: Rock slopes along motorways in the Higher Himalayan terrains are prone to various types of failure. To mitigate these failures effectively, a thorough assessment of rock mass behavior is required. The present research employs and compares widely practiced geomechanical classification schemes, viz., RQD, RMR, SMR, Q-slope, and GSI. A 23 km road-cut section along the Sangla to Chitkul route in the Higher Himalayan region (India) was taken up for this work. A total of 18 locations were selected, and their slope and rock mass properties were examined. Afterwards, the most influential parameters in RMR, SMR, and Q-slope were evaluated through a machine learning algorithm, i.e., Random Forest. For RMRbasic, about 83% of the rock slopes were designated as being in Good condition and the rest were of Fair quality. Evaluation of the slope mass rating at all 18 locations identified eight sites as partially unstable and six sites as partially stable. The remaining four locations varied between Very Bad and Bad slope conditions, necessitating the installation of mechanical supports and redesign of the slopes. For the SMR classification, feature importance analysis revealed the predominance of the F3 variable, RQD, and intact rock strength. The Q-slope approach was incorporated to identify the steepest stable angle at the examined locations. For the Q-slope rating, Jn and RQD were found to have the most influence on the classification of the slopes. Three zones have been identified in the study area on the basis of GSI scores, i.e., A (65–95), B (45–55), and C (25–35). This study highlights the application of multiple geomechanical classification schemes, demonstrating how each approach can complement the others.
Abstract: Deep mixing, also known as deep stabilization, is a widely used ground improvement method in Nordic countries, particularly in urban and infrastructure projects, aiming to enhance the properties of soft, sensitive clays. Understanding the shear strength of stabilized soils and identifying key influencing factors are essential for ensuring the structural stability and durability of engineering structures. This study introduces a novel explainable artificial intelligence framework to investigate critical soil properties affecting shear strength, utilizing a data set derived from stabilization tests conducted on laboratory samples from the 1990s. The proposed framework investigates the statistical variability and distribution of crucial parameters affecting shear strength within the collected data set. Subsequently, machine learning models are trained and tested to predict soil shear strength based on input features such as water/binder ratio and water content. Global model analysis using feature importance and Shapley additive explanations is conducted to understand the influence of soil input features on shear strength. Further exploration is carried out using partial dependence plots, individual conditional expectation plots, and accumulated local effects to uncover the degree of dependency and important thresholds between key stabilized soil parameters and shear strength. Heat map and feature interaction analysis techniques are then utilized to investigate interactions and correlations among soil properties. Lastly, a more specific investigation is conducted on particular soil samples to highlight the most influential soil properties locally, employing the local interpretable model-agnostic explanations technique. The framework is validated by analyzing laboratory test results obtained from uniaxial compression tests. The framework demonstrates an ability to predict the shear strength of stabilized soil samples with an accuracy surpassing 90%. Importantly, the explainability results underscore the substantial impact of water content and the water/binder ratio on shear strength.
Funding: The National Natural Science Foundation of China (Nos. 42377164, 41972280 and 42272326); National Natural Science Outstanding Youth Foundation of China (No. 52222905); Natural Science Foundation of Jiangxi Province, China (No. 20232BAB204091); Natural Science Foundation of Jiangxi Province, China (No. 20232BAB204077).
Abstract: Landslide inventory is an indispensable output variable of landslide susceptibility prediction (LSP) modelling. However, the influence of landslide inventory incompleteness on LSP, and the way the resulting LSP errors propagate through the model, have not been explored. Adopting Xunwu County, China, as an example, the existing landslide inventory is first obtained and assumed to contain all landslide inventory samples under ideal conditions, after which different conditions of missing landslide inventory samples are simulated by random sampling. These comprise the condition in which landslide inventory samples across the whole study area are missing randomly at proportions of 10%, 20%, 30%, 40% and 50%, as well as the condition in which the landslide inventory samples in the south of Xunwu County are missing in aggregation. Then, five machine learning models, including Random Forest (RF) and Support Vector Machine (SVM), are used to perform LSP. Finally, the LSP results are evaluated to analyze the LSP uncertainties under the various conditions. In addition, this study introduces various interpretability methods for machine learning models to explore the changes in the decision basis of the RF model under the various conditions. Results show that (1) randomly missing landslide inventory samples at certain proportions (10%–50%) may affect the LSP results for local areas; (2) aggregation of missing landslide inventory samples may cause significant biases in LSP, particularly in areas where samples are missing; and (3) when 50% of landslide samples are missing (either randomly or aggregated), the changes in the decision basis of the RF model are mainly manifested in two aspects: first, the importance ranking of environmental factors differs slightly; second, for LSP modelling in the same test grid unit, the weights of individual model factors may vary drastically.
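The two missing-sample conditions described in the abstract above (random loss at fixed proportions, and aggregated loss in one region) can be simulated roughly as follows. This is a sketch under stated assumptions: the inventory records, the `drop_random` and `drop_aggregated` helpers, and the "south" rule are all hypothetical and do not reproduce the study's sampling procedure.

```python
import random

def drop_random(samples, proportion, seed=0):
    """Return the inventory with `proportion` of the samples removed at random."""
    rng = random.Random(seed)               # fixed seed for a reproducible scenario
    keep_count = len(samples) - round(len(samples) * proportion)
    return rng.sample(samples, keep_count)

def drop_aggregated(samples, is_in_region):
    """Return the inventory with every sample inside one region removed."""
    return [s for s in samples if not is_in_region(s)]

# Hypothetical inventory of (sample id, northing band 0-9); band < 3 plays "the south".
inventory = [(i, i % 10) for i in range(100)]

# Random loss at the proportions named in the abstract.
scenarios = {p: drop_random(inventory, p) for p in (0.1, 0.2, 0.3, 0.4, 0.5)}

# Aggregated loss: all samples in the southern bands go missing together.
south_missing = drop_aggregated(inventory, lambda s: s[1] < 3)
```

Each reduced inventory would then be used to retrain the susceptibility model, so that predictions can be compared against those from the complete inventory.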
Abstract: Knowledge of the factors influencing nutrient-limited subtropical maize yield, and subsequent yield prediction, is crucial for effective nutrient management, maximizing profitability, ensuring food security, and promoting environmental sustainability. We analyzed data from nutrient omission plot trials (NOPTs) conducted in 324 farmers' fields across ten agroecological zones (AEZs) in the Eastern Indo-Gangetic Plains (EIGP) of Bangladesh to explain maize yield variability and identify variables controlling nutrient-limited yields. An additive main effect and multiplicative interaction (AMMI) model was used to explain maize yield variability with nutrient addition. Interpretable machine learning (ML) algorithms in automatic machine learning (AutoML) frameworks were subsequently used to predict nutrient-limited yield relative to attainable yield (RY) and to rank the variables that control RY. The stack-ensemble model was identified as the best-performing model for predicting the RYs of N, P, and Zn. In contrast, deep learning outperformed all base learners for predicting RY_K. The best model's root mean square errors (RMSEs) were 0.122, 0.105, 0.123, and 0.104 for RY_N, RY_P, RY_K, and RY_Zn, respectively. The permutation-based feature importance technique identified soil pH as the most critical variable controlling RY_N and RY_P. RY_K was lower in the eastern longitudinal direction. Soil N and Zn were associated with RY_Zn. The predicted median RYs of N, P, K, and Zn, representing average soil fertility, were 0.51, 0.84, 0.87, and 0.97, accounting for 44, 54, 54, and 48% of the upland dry-season crop area of Bangladesh, respectively. Efforts are needed to update databases cataloging variability in land type inundation classes, soil characteristics, and INS, and to combine them with farmers' crop management information to develop more precise nutrient guidelines for maize in the EIGP.
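Permutation-based feature importance, as named in the abstract above, measures how much a model's error grows when one input column is shuffled. The sketch below illustrates the general technique only; `toy_predict`, the sample rows, and the column meanings are hypothetical, and the study's AutoML models are not reproduced.

```python
import random

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def permutation_importance(predict, rows, targets, feature_index, repeats=20, seed=0):
    """Mean increase in RMSE after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(r) for r in rows])
    increases = []
    for _ in range(repeats):
        column = [r[feature_index] for r in rows]
        rng.shuffle(column)                 # break the feature-target relationship
        shuffled = [r[:feature_index] + [v] + r[feature_index + 1:]
                    for r, v in zip(rows, column)]
        increases.append(rmse(targets, [predict(r) for r in shuffled]) - baseline)
    return sum(increases) / repeats

# Hypothetical model (an assumption): relative yield depends on feature 0 only.
def toy_predict(row):
    return 0.1 * row[0]

# Hypothetical rows: [soil pH, an unrelated covariate]
rows = [[5.0, 1.0], [6.0, 2.0], [7.0, 3.0], [8.0, 4.0]]
targets = [toy_predict(r) for r in rows]

imp_ph = permutation_importance(toy_predict, rows, targets, feature_index=0)
imp_other = permutation_importance(toy_predict, rows, targets, feature_index=1)
```

Here shuffling the first column raises the error while shuffling the unused column leaves it unchanged, which is the ranking signal the technique relies on.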