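Abstract: To address the low accuracy of short-term power load forecasting caused by the difficulty of extracting features from historical load data, this work proposes a short-term power load forecasting algorithm based on stacked generalization: logistic grey wolf optimizer-extreme gradient boosting-light gradient boosting machine-gated recurrent unit (LGWO-XGBoost-LightGBM-GRU). The algorithm first improves the grey wolf optimizer (GWO) with a logistic map to obtain the LGWO algorithm. LGWO is then used to tune the parameters of XGBoost, LightGBM, and GRU separately. Next, XGBoost and LightGBM perform a preliminary extraction of different features of the data. Finally, the extracted features are merged into the historical load dataset as input, and GRU produces the final load forecast. The algorithm is validated on load forecasting for an industrial park. Compared with the least squares support vector machines (LS-SVM) algorithm, the root mean square error decreased by 68.85%, the mean absolute error decreased by 69.57%, the mean absolute percentage error decreased by 69.97%, and the coefficient of determination increased by 8.42%. The algorithm improves the accuracy of short-term power load forecasting.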
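The abstract above does not state where the logistic map enters the grey wolf optimizer. A common choice in chaotic metaheuristics is to seed the initial population from the map rather than from a uniform random generator, and the sketch below illustrates only that choice. The function names, the seed `x0`, the control parameter `mu = 4`, and the example search bounds are assumptions made for illustration, not the authors' settings.

```python
def logistic_sequence(x0, n, mu=4.0):
    """Return n iterates of the logistic map x <- mu * x * (1 - x)."""
    xs = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        xs.append(x)
    return xs


def chaotic_population(n_wolves, bounds, x0=0.37):
    """Build an initial population by scaling logistic-map iterates into bounds.

    bounds is a list of (low, high) pairs, one per search dimension.
    x0 is a hypothetical seed; it should avoid fixed points such as 0, 0.75, 1.
    """
    dim = len(bounds)
    seq = logistic_sequence(x0, n_wolves * dim)
    population = []
    for i in range(n_wolves):
        wolf = []
        for j, (low, high) in enumerate(bounds):
            # Each iterate lies in [0, 1], so this maps it into [low, high].
            wolf.append(low + seq[i * dim + j] * (high - low))
        population.append(wolf)
    return population


if __name__ == "__main__":
    # Hypothetical bounds for two hyperparameters (learning rate, tree depth).
    for wolf in chaotic_population(5, [(0.01, 0.3), (2.0, 10.0)]):
        print(wolf)
```

Because the map is deterministic, the same seed always yields the same population, which makes optimizer runs reproducible without a random number generator.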
Funding: Funded by the National Natural Science Foundation Project (Grant No. 52274015) and the National Science and Technology Major Project (Grant No. 2025ZD1402205).
Abstract: Casing damage resulting from sand production in unconsolidated sandstone reservoirs can significantly impact the average production of oil wells. However, predicting it remains challenging because of the complex damage mechanism caused by sand production. This paper presents an approach that combines feature selection (FS) with boosting algorithms to predict casing damage in unconsolidated sandstone reservoirs. A novel TriScore FS technique is developed, combining mRMR, Random Forest, and F-test. The approach integrates three distinct feature selection approaches (TriScore, wrapper, and hybrid TriScore-wrapper) and four interpretable boosting models (AdaBoost, XGBoost, LightGBM, CatBoost). Moreover, Shapley additive explanations (SHAP) were used to identify the most significant features across the engineering, geological, and production categories. The CatBoost model, using the hybrid TriScore-wrapper G_(1)G_(2) FS method, showed exceptional performance on data from the Gangxi Oilfield, achieving the highest accuracy (95.5%) and recall (89.7%) among the tested models. Casing service time, casing wall thickness, and perforation density were selected as the three most important features. This framework enhances predictive robustness and is an effective tool for policymakers and energy analysts, confirming its capability to deliver reliable casing damage forecasts.
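The abstract names the three scorers behind TriScore (mRMR, Random Forest, F-test) but not the rule that combines them. One plausible way to fuse scores on different scales is to min-max normalize each scorer's output and average the results, as sketched below. The fusion rule, the function names, and the toy score values are all assumptions for illustration; they are not taken from the paper.

```python
def min_max(scores):
    """Rescale a {feature: score} map to the range [0, 1]."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All features scored equally, so none is preferred.
        return {name: 0.0 for name in scores}
    return {name: (value - low) / (high - low) for name, value in scores.items()}


def fuse_scores(*score_maps):
    """Average the normalized scores of several scorers (assumed fusion rule)."""
    normalized = [min_max(m) for m in score_maps]
    features = list(normalized[0])
    return {f: sum(m[f] for m in normalized) / len(normalized) for f in features}


def top_k(fused, k):
    """Return the k feature names with the highest fused score."""
    return sorted(fused, key=fused.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy scores for three placeholder features; not real data.
    mrmr_scores = {"a": 0.9, "b": 0.1, "c": 0.5}
    forest_scores = {"a": 0.5, "b": 0.1, "c": 0.3}
    f_test_scores = {"a": 12.0, "b": 2.0, "c": 7.0}
    fused = fuse_scores(mrmr_scores, forest_scores, f_test_scores)
    print(top_k(fused, 2))
```

Normalizing first keeps the F-statistic, which is unbounded, from dominating importance values that already lie between 0 and 1.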
Funding: Supported by the Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (Research on Power Meteorology Digitalization Application for Future Climate Scenarios and New Energy Operation Risks, J2023076).
Abstract: Predicting power grid faults from meteorological factors is of great significance for reducing the economic losses that such faults cause. However, existing methods fail to extract key features effectively and to predict fault types accurately because of the complexity of meteorological factors and their nonlinear relationships. In response to these challenges, we propose the Feature-Enhanced XGBoost power grid fault prediction method (FE-XGBoost). Specifically, we first combine the gradient boosting decision tree with recursive feature elimination to extract essential features from meteorological data. Then, we incorporate a piecewise linear chaotic map to enhance the optimization accuracy of the sparrow search algorithm. Finally, we construct an XGBoost-based model for the classification of power grid meteorological faults and use the enhanced sparrow search algorithm to optimize hyperparameters such as the tree depth, learning rate, and number of iterations. Experimental results demonstrate that our method outperforms the baseline models in predicting power grid faults accurately.
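The feature-extraction step described above pairs a tree model with recursive feature elimination (RFE): score the current features, drop the weakest, and repeat. The sketch below shows that loop in plain Python with a pluggable scoring function. In the abstract's setting the scores would come from a gradient boosting decision tree refit on each subset; here `toy_importance`, its weights, and the meteorological feature names are hypothetical stand-ins.

```python
def recursive_feature_elimination(features, importance_fn, n_keep):
    """Drop the least important feature until n_keep features remain.

    importance_fn(subset) must return a {feature: importance} map computed
    on the current subset only, mimicking a model refit at every round.
    """
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        importances = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: importances[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


def toy_importance(subset):
    """Hypothetical stand-in for refitting a GBDT and reading its importances."""
    base = {
        "temperature": 0.40,
        "humidity": 0.25,
        "wind_speed": 0.20,
        "pressure": 0.10,
        "visibility": 0.05,
    }
    total = sum(base[name] for name in subset)
    # Renormalize so importances sum to 1 on the current subset.
    return {name: base[name] / total for name in subset}


if __name__ == "__main__":
    all_features = ["temperature", "humidity", "wind_speed", "pressure", "visibility"]
    kept, dropped = recursive_feature_elimination(all_features, toy_importance, 3)
    print("kept:", kept)
    print("dropped in order:", dropped)
```

Recomputing importances after every removal is what distinguishes RFE from a one-shot ranking, since the importance of a feature can change once a correlated feature is gone.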
Abstract: In this investigation, the Gradient Boosting (GB), Linear Regression (LR), Decision Tree (DT), and Voting algorithms were applied to predict the distribution pattern of Au geochemical data. Trace and indicator elements, including Mo, Cu, Pb, Zn, Ag, Ni, Co, Mn, Fe, and As, were used with these machine learning algorithms (MLAs) to predict Au concentration values in the Doostbigloo porphyry Cu-Au-Mo mineralization area. The performance of the models was evaluated using the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) metrics. The proposed ensemble Voting algorithm outperformed the other models, yielding more accurate predictions according to both metrics. The predicted data from the GB, LR, DT, and Voting MLAs were modeled using the Concentration-Area fractal method, and Au geochemical anomalies were mapped. To compare and validate the results, factors such as the location of the mineral deposits, their surface extent, and mineralization trend were considered. The results indicate that integrating hybrid MLAs with fractal modeling significantly improves geochemical prospectivity mapping. Among the four models, three (DT, GB, Voting) accurately identified both mineral deposits. The LR model, however, only identified Deposit I (central), and its mineralization trend diverged from the field data. The GB and Voting models produced similar results, with their final maps derived from fractal modeling showing the same anomalous areas. The anomaly boundaries identified by these two models are consistent with the two known reserves in the region. The results and plots related to prediction indicators and error rates for these two models also show high similarity, with lower error rates than the other models. Notably, the Voting model demonstrated superior performance in accurately delineating mineral deposit locations and identifying realistic mineralization trends while minimizing false anomalies.
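The two evaluation metrics named above, and the basic idea of a voting ensemble for regression, can be sketched in a few lines. The abstract does not say how the Voting model weights its base learners, so the unweighted mean used here is an assumption, and all function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent. True values must be nonzero."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def voting_average(predictions):
    """Combine base-model predictions by an unweighted mean per sample.

    predictions is a list of prediction lists, one per base model.
    Equal weighting is an assumption, not a detail from the abstract.
    """
    return [sum(column) / len(column) for column in zip(*predictions)]


if __name__ == "__main__":
    # Placeholder numbers for illustration only; not geochemical data.
    y_true = [100.0, 200.0, 300.0]
    model_a = [110.0, 190.0, 310.0]
    model_b = [95.0, 205.0, 290.0]
    ensemble = voting_average([model_a, model_b])
    print("RMSE:", rmse(y_true, ensemble))
    print("MAPE:", mape(y_true, ensemble))
```

Averaging lets errors of opposite sign from different base models partly cancel, which is one reason an ensemble can score lower on both metrics than each of its members.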