Funding: Funded by the National Natural Science Foundation Project (Grant No. 52274015) and the National Science and Technology Major Project (Grant No. 2025ZD1402205).
Abstract: Casing damage resulting from sand production in unconsolidated sandstone reservoirs can significantly impact the average production of oil wells. However, predicting such damage remains challenging because of the complex damage mechanism caused by sand production. This paper presents an approach that combines feature selection (FS) with boosting algorithms to predict casing damage in unconsolidated sandstone reservoirs. A novel TriScore FS technique is developed, combining mRMR, Random Forest, and the F-test. The approach integrates three distinct feature selection approaches (TriScore, wrapper, and hybrid TriScore-wrapper) with four interpretable boosting models (AdaBoost, XGBoost, LightGBM, and CatBoost). Moreover, SHapley Additive exPlanations (SHAP) was used to identify the most significant features across the engineering, geological, and production categories. The CatBoost model, using the hybrid TriScore-wrapper G₁G₂ FS method, showed exceptional performance on data from the Gangxi Oilfield. It achieved the highest accuracy (95.5%) and recall (89.7%) among the tested models. Casing service time, casing wall thickness, and perforation density were selected as the three most important features. The framework enhances predictive robustness and offers policymakers and energy analysts an effective tool for reliable casing damage forecasting.
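The abstract names the three scorers behind TriScore (mRMR, Random Forest, and the F-test) but does not say how their outputs are combined. The sketch below shows one plausible fusion rule, assumed here purely for illustration: min-max normalize each scorer's output and average the results per feature. The function names `min_max` and `fuse_scores`, the feature `other_feature`, and every score value are hypothetical and are not taken from the paper.

```python
from statistics import mean


def min_max(scores):
    """Rescale a {feature: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # all scores equal: no feature is preferred
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def fuse_scores(*score_tables):
    """Average each feature's normalized score across all tables.

    Assumed fusion rule (not from the paper). Returns (feature, score)
    pairs sorted from most to least important.
    """
    normalized = [min_max(table) for table in score_tables]
    features = list(normalized[0])
    fused = {name: mean(table[name] for table in normalized) for name in features}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores, invented for illustration only (not results from the paper).
mrmr_scores = {"service_time": 0.9, "wall_thickness": 0.6,
               "perforation_density": 0.5, "other_feature": 0.1}
rf_importances = {"service_time": 0.40, "wall_thickness": 0.30,
                  "perforation_density": 0.25, "other_feature": 0.05}
f_statistics = {"service_time": 30.0, "wall_thickness": 12.0,
                "perforation_density": 18.0, "other_feature": 2.0}

ranking = fuse_scores(mrmr_scores, rf_importances, f_statistics)
top_features = [name for name, _ in ranking[:3]]
print(top_features)
```

Normalizing before averaging keeps a scorer with a large numeric range, such as the F-statistic, from dominating the other two. A rank-based fusion would be an equally reasonable assumption.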
Funding: Funded by the Natural Science Foundation of Chongqing (Grant No. CSTB2022NSCQ-MSX0594) and the Humanities and Social Sciences Research Project of the Ministry of Education (Grant No. 16YJCZH061).
Abstract: Boosting algorithms have been widely used in landslide susceptibility mapping (LSM) studies. However, these algorithms have distinct computational strategies and hyperparameters, making it challenging to propose an ideal LSM model. To investigate the impact of different boosting algorithms and hyperparameter optimization algorithms on LSM, this study constructed a geospatial database comprising 12 conditioning factors, such as elevation, stratum, and annual average rainfall. The XGBoost (XGB), LightGBM (LGBM), and CatBoost (CB) algorithms were employed to construct the LSM model. Furthermore, the Bayesian optimization (BO), particle swarm optimization (PSO), and Hyperband optimization (HO) algorithms were applied to optimize the LSM model. The boosting algorithms exhibited varying performance, with CB demonstrating the highest precision, followed by LGBM, and XGB showing the poorest precision. The hyperparameter optimization algorithms also performed differently, with HO outperforming PSO and BO showing the poorest performance. The HO-CB model achieved the highest precision, with an accuracy of 0.764, an F1-score of 0.777, an area under the curve (AUC) value of 0.837 for the training set, and an AUC value of 0.863 for the test set. The model was interpreted using SHapley Additive exPlanations (SHAP), revealing that slope, curvature, topographic wetness index (TWI), degree of relief, and elevation significantly influenced landslides in the study area. This study examines the use of various boosting algorithms and hyperparameter optimization algorithms in Wanzhou District and offers a scientific reference for LSM and disaster prevention research. It proposes the HO-CB-SHAP framework as an effective approach to accurately forecast landslide disasters and interpret LSM models. However, limitations exist concerning the generalizability of the model and the data processing, which require further exploration in subsequent studies.
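The abstract reports accuracy, F1-score, and AUC. As a reminder of how these three metrics are defined, here is a minimal pure-Python sketch. The labels and scores are invented toy values, not data from the study, and the 0.5 decision threshold and the function names `accuracy_and_f1` and `roc_auc` are assumptions made for the example.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration: 1 = landslide, 0 = no landslide.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]  # model susceptibility scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(accuracy, f1, auc)
```

Accuracy and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank the samples, which is why a model can report a moderate accuracy alongside a higher AUC.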