Funding: The authors gratefully acknowledge financial support from the National Key Research and Development Project (Grant No. 2019YFC1509800) and the National Natural Science Foundation of China (Grant No. 12172211).
Abstract: Geotechnical engineering data are usually small-sample and high-dimensional, which poses many challenges for predictive modeling. This paper uses a typical high-dimensional, small-sample swell pressure (P_s) dataset to explore whether multi-algorithm hybrid ensembles and dimensionality reduction can mitigate the uncertainty of soil parameter prediction. A base learner pool is constructed from six machine learning (ML) algorithms, and four ensemble methods, Stacking (SG), Blending (BG), Voting regression (VR), and Feature weight linear stacking (FWL), are used for the multi-algorithm ensemble. Furthermore, permutation importance is used for feature dimensionality reduction to mitigate the impact of weakly correlated variables on predictive modeling. The results show that the proposed methods outperform traditional prediction models and base ML models; FWL is more suitable for modeling with small-sample datasets, and dimensionality reduction simplifies the data structure and reduces the adverse impact of the small-sample effect, which points the way to feature selection for predictive modeling. Based on the ensemble methods, the five primary factors affecting P_s, with their feature importance, are the maximum dry density (31.145%), clay fraction (15.876%), swell percent (15.289%), plasticity index (14%), and optimum moisture content (13.69%). The influence of the input parameters on P_s is also investigated, and the findings are in line with the existing literature.
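As a rough illustration of two ideas named in this abstract, voting regression and permutation importance, the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the toy data, the three hand-written base learners, and the helper names (`voting_predict`, `permutation_importance`) are all assumptions made for the example, and the paper's six ML algorithms and the FWL method are not reproduced.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def voting_predict(models, X):
    """Unweighted voting regression: average the base learners' predictions."""
    return [mean(m(row) for m in models) for row in X]


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, predict(X_perm)) - baseline)
        importances.append(mean(increases))
    return importances


# Toy data (hypothetical): the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [3.0 * a + 0.5 * b for a, b, _ in X]

# Hypothetical "base learners": fixed linear rules standing in for fitted models.
models = [
    lambda r: 3.0 * r[0] + 0.5 * r[1],
    lambda r: 2.8 * r[0] + 0.6 * r[1],
    lambda r: 3.2 * r[0] + 0.4 * r[1],
]


def ensemble(rows):
    """Predict with the averaged ensemble."""
    return voting_predict(models, rows)


importances = permutation_importance(ensemble, X, y)
```

In this toy setting, a feature whose importance is close to zero (feature 2 here) would be a candidate for removal, which is the sense in which permutation importance can drive dimensionality reduction.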
Funding: This work was financially supported by the National Natural Science Foundation of China (Grant Nos. 52074258, 41941018, and U21A20153).
Abstract: Based on data from the Jilin Water Diversion Tunnels from the Songhua River (China), an improved real-time prediction method for tunnel boring machine (TBM) cutter-head torque, optimized by multiple algorithms, is presented. Firstly, a function that excludes invalid and abnormal data is established to distinguish the TBM operating state, and a feature selection method based on the SelectKBest algorithm is proposed. Accordingly, the ten features most closely related to the cutter-head torque are selected as input variables; in descending order of influence, they are the sum of motor torque, cutter-head power, sum of motor power, sum of motor current, advance rate, cutter-head pressure, total thrust force, penetration rate, cutter-head rotational velocity, and field penetration index. Secondly, the structure of a real-time cutter-head torque prediction model is developed, based on a bidirectional long short-term memory (BLSTM) network that integrates the dropout algorithm to prevent overfitting. Then, an algorithm that optimizes the model's hyperparameters using a Bayesian method and cross-validation is proposed. Early stopping and checkpoint algorithms are integrated to optimize the training process. Finally, a BLSTM-based real-time cutter-head torque prediction model is developed, which fully utilizes the previous time-series tunneling information. The mean absolute percentage error (MAPE) of the model in the verification section is 7.3%, implying that the presented model is suitable for real-time cutter-head torque prediction. Furthermore, an incremental learning method based on the above base model is introduced to improve the adaptability of the model during TBM tunneling. A comparison of the prediction performance of the base and incremental learning models in the same tunneling section shows that: (1) the MAPE of the predicted results of the BLSTM-based real-time cutter-head torque prediction model remains below 10%, and both the coefficient of determination (R^2) and the correlation coefficient (r) between measured and predicted values exceed 0.95; and (2) the incremental learning method is suitable for real-time cutter-head torque prediction and can effectively improve the prediction accuracy and generalization capacity of the model during the excavation process.
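The three evaluation metrics reported in this abstract (MAPE, R^2, and r) can be computed as in the sketch below, using their standard textbook definitions. The measured and predicted torque values are invented for illustration and are not taken from the paper; the function names are likewise assumptions.

```python
from math import sqrt
from statistics import mean


def mape(measured, predicted):
    """Mean absolute percentage error, in percent (measured values must be nonzero)."""
    return 100.0 * mean(abs((m - p) / m) for m, p in zip(measured, predicted))


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m_bar = mean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - m_bar) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between measured and predicted values."""
    m_bar, p_bar = mean(measured), mean(predicted)
    cov = sum((m - m_bar) * (p - p_bar) for m, p in zip(measured, predicted))
    var_m = sum((m - m_bar) ** 2 for m in measured)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov / sqrt(var_m * var_p)


# Hypothetical cutter-head torque readings (arbitrary units), for illustration only.
measured = [1000.0, 1200.0, 1500.0, 1800.0, 2000.0]
predicted = [950.0, 1260.0, 1425.0, 1890.0, 2000.0]

scores = {
    "MAPE (%)": mape(measured, predicted),
    "R^2": r_squared(measured, predicted),
    "r": pearson_r(measured, predicted),
}
```

Note that R^2 and r are distinct quantities: r measures linear association only, whereas R^2 as defined here also penalizes systematic bias in the predictions, so reporting both, as the abstract does, is informative.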