Funding: Financial support from the Brazilian National Council for Scientific and Technological Development (CNPq) and the Federal University of Ouro Preto. Financial support from the Minas Gerais Research Foundation (FAPEMIG) under grant number APQ-06559-24 is also gratefully acknowledged.
Abstract: This study investigated forest recovery in the Atlantic Rainforest and Rupestrian Grassland of Brazil using the diffusive-logistic growth (DLG) model. This model simulates vegetation growth in the two mountain biomes considering spatial location, time, and two key parameters: diffusion rate and growth rate. A Bayesian framework was employed to analyze the model's parameters and assess prediction uncertainties. Satellite imagery from 1992 and 2022 was used for model calibration and validation. By solving the DLG model using the finite difference method, we predicted a 6.6%–51.1% increase in vegetation density for the Atlantic Rainforest and a 5.3%–99.9% increase for the Rupestrian Grassland over 30 years, with the latter showing slower recovery but achieving a better model fit (lower RMSE) compared to the Atlantic Rainforest. The Bayesian approach revealed well-defined parameter distributions and lower parameter values for the Rupestrian Grassland, supporting the slower recovery prediction. Importantly, the model achieved good agreement with observed vegetation patterns in unseen validation data for both biomes. While there were minor spatial variations in accuracy, the overall distributions of predicted and observed vegetation density were comparable. Furthermore, this study highlights the importance of considering uncertainty in model predictions. Bayesian inference allowed us to quantify this uncertainty, demonstrating that the model's performance can vary across locations. Our approach provides valuable insights into forest regeneration process uncertainties, enabling comparisons of modeled scenarios at different recovery stages for better decision-making in these critical mountain biomes.
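As a rough illustration of what solving a diffusive-logistic growth model with finite differences involves, the sketch below advances a one-dimensional density profile with an explicit scheme. The abstract does not state the governing equation, so the standard reaction-diffusion form du/dt = D * d²u/dx² + r * u * (1 - u/K) is an assumption here, as are the one-dimensional grid, the zero-flux boundaries, and every parameter value; the function names `dlg_step` and `simulate_dlg` are hypothetical.

```python
def dlg_step(u, diffusion, growth, dx, dt, capacity=1.0):
    """Advance a 1-D density profile by one explicit finite-difference step.

    Assumed model (not taken from the abstract):
        du/dt = D * d2u/dx2 + r * u * (1 - u / K)
    """
    n = len(u)
    new_u = []
    for i in range(n):
        # Zero-flux boundaries: mirror the neighbouring cell at each end.
        left = u[i - 1] if i > 0 else u[i + 1]
        right = u[i + 1] if i < n - 1 else u[i - 1]
        laplacian = (left - 2.0 * u[i] + right) / dx ** 2
        reaction = growth * u[i] * (1.0 - u[i] / capacity)
        new_u.append(u[i] + dt * (diffusion * laplacian + reaction))
    return new_u


def simulate_dlg(u0, diffusion, growth, dx, dt, steps, capacity=1.0):
    """Run the explicit scheme for a fixed number of time steps."""
    # Stability limit of the explicit diffusion update: D * dt / dx^2 <= 0.5.
    if diffusion * dt / dx ** 2 > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    u = list(u0)
    for _ in range(steps):
        u = dlg_step(u, diffusion, growth, dx, dt, capacity)
    return u


# Hypothetical example: one vegetated cell spreading over 30 yearly steps.
initial = [0.0] * 10 + [0.5] + [0.0] * 10
final = simulate_dlg(initial, diffusion=0.1, growth=0.05, dx=1.0, dt=1.0, steps=30)
```

In this toy setup the diffusion rate controls how far vegetation spreads into neighbouring cells and the growth rate controls how quickly each cell approaches its capacity, which is why lower values of both would correspond to slower recovery.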
Funding: Financial support from the High-end Foreign Expert Introduction program (No. G20190022002), the Chongqing Construction Science and Technology Plan Project (2019-0045), and the Chongqing Engineering Research Center of Disaster Prevention & Control for Banks and Structures in Three Gorges Reservoir Area (Nos. SXAPGC18ZD01 and SXAPGC18YB03).
Abstract: Accurate assessment of undrained shear strength (USS) for soft sensitive clays is a great concern in geotechnical engineering practice. This study applies data-driven extreme gradient boosting (XGBoost) and random forest (RF) ensemble learning methods to capture the relationships between the USS and various basic soil parameters. Based on the soil data sets from the TC304 database, a general approach is developed to predict the USS of soft clays using these two machine learning methods, with five feature variables: the preconsolidation stress (PS), vertical effective stress (VES), liquid limit (LL), plastic limit (PL) and natural water content (W). To reduce the dependence on rules of thumb and inefficient brute-force search, the Bayesian optimization method is applied to determine appropriate model hyperparameters for both XGBoost and RF. The developed models are comprehensively compared with three other machine learning methods and two transformation models with respect to predictive accuracy and robustness under 5-fold cross-validation (CV). It is shown that the XGBoost-based and RF-based methods outperform these approaches. In addition, the XGBoost-based model provides feature importance ranks, which enhances the interpretability of the model and makes it a promising tool for predicting geotechnical parameters.
Funding: Supported by the National Science Foundation of China (42107183).
Abstract: Driven piles are used in many geological environments as a practical and convenient structural component. Hence, determining the drivability of piles is of great importance in complex geotechnical applications. Conventional methods of predicting pile drivability often rely on simplified physical models or empirical formulas, which may lack accuracy or applicability in complex geological conditions. Therefore, this study presents a practical machine learning approach, namely a Random Forest (RF) optimized by Bayesian Optimization (BO) and Particle Swarm Optimization (PSO), which not only enhances prediction accuracy but also better adapts to varying geological environments when predicting the drivability parameters of piles (i.e., maximum compressive stress, maximum tensile stress, and blows per foot). In addition, support vector regression, extreme gradient boosting, k-nearest neighbor, and decision tree models are applied for comparison. To train and test these models, of the 4072 datasets collected with 17 model inputs, 3258 datasets were randomly selected for training and the remaining 814 datasets were used for model testing. Lastly, the results of these models were compared and evaluated using two performance indices, i.e., the root mean square error (RMSE) and the coefficient of determination (R²). The results indicate that the optimized RF model achieved lower RMSE than the other prediction models in predicting the three parameters, specifically 0.044, 0.438, and 0.146, and higher R² values than the other implemented techniques, specifically 0.966, 0.884, and 0.977. In addition, the sensitivity and uncertainty of the optimized RF model were analyzed using Sobol sensitivity analysis and Monte Carlo (MC) simulation. It can be concluded that the optimized RF model could be used to predict the performance of the pile, and it may provide a useful reference for solving some problems under similar engineering conditions.
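The two performance indices named above can be computed as follows. This is a minimal sketch using the usual textbook definitions of RMSE and R²; the function names and the sample values are illustrative assumptions and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, chosen only to make the arithmetic easy to follow.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(rmse(observed, predicted))       # one error of 1 over 4 samples: sqrt(1/4)
print(r_squared(observed, predicted))  # SS_res = 1, SS_tot = 5
```

A lower RMSE and an R² closer to 1 both indicate a better fit, which is the sense in which the optimized RF model is reported to outperform the comparison models.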
Funding: This work was supported in part by the Major Technical Innovation Projects of Hubei Province under Grant 2018ABA099, in part by the National Science Fund for Youth of Hubei Province of China under Grant 2018CFB408, in part by the Natural Science Foundation of Hubei Province of China under Grant 2015CFA061, in part by the National Nature Science Foundation of China under Grant 61272278, and in part by Research on Key Technologies of Intelligent Decision-making for Food Big Data under Grant 2018A01038.
Abstract: As soil heavy metal pollution increases year by year, risk assessment of soil heavy metal pollution is gradually gaining attention. Soil heavy metal datasets are usually imbalanced: most of the samples are safe samples that are not contaminated with heavy metals. Random Forest (RF) has strong generalization ability and does not overfit easily. In this paper, we improve the Bagging algorithm and the simple voting method of RF. A W-RF algorithm based on adaptive Bagging and weighted voting is proposed to improve the classification performance of RF on imbalanced datasets. Adaptive Bagging enables trees in RF to learn information from the positive samples, and the weighted voting method gives trees with superior performance higher voting weights. Experiments were conducted using G-mean, recall and F1-score to set the weights, and the results obtained were better than those of RF. Risk assessment experiments were conducted using W-RF on a heavy metal dataset from agricultural fields around Wuhan. The experimental results show that the RW-RF algorithm, which uses recall to calculate the classifier weights, has the best classification performance. At the end of this paper, we optimize the hyperparameters of the RW-RF algorithm with a Bayesian optimization algorithm, using G-mean as the objective function to obtain the optimal hyperparameter combination within the number of iterations.
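To make the metrics and the voting idea concrete, the sketch below computes recall and G-mean for a binary problem and combines per-tree votes using per-tree weights. The abstract does not give the exact weighting formula of W-RF or RW-RF, so using a score such as recall directly as a voting weight is an assumption, and the function names and sample values are hypothetical.

```python
import math


def recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(1 for p in relevant if p == label) / len(relevant)


def g_mean(y_true, y_pred):
    """Geometric mean of the recalls on the positive (1) and negative (0) classes."""
    return math.sqrt(recall(y_true, y_pred, 1) * recall(y_true, y_pred, 0))


def weighted_vote(tree_predictions, weights):
    """Combine binary votes for one sample; each tree's vote counts with its weight."""
    positive = sum(w for p, w in zip(tree_predictions, weights) if p == 1)
    negative = sum(w for p, w in zip(tree_predictions, weights) if p == 0)
    return 1 if positive > negative else 0


# Hypothetical imbalanced labels: 2 contaminated (1) and 4 safe (0) samples.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)

# Three trees vote on one sample; the weights stand in for per-tree scores.
votes = [1, 0, 0]
tree_weights = [0.9, 0.3, 0.4]
decision = weighted_vote(votes, tree_weights)
```

In the example, a simple majority vote would label the sample as safe (two votes to one), whereas the weighted vote follows the single tree with the highest weight. G-mean is a common choice on imbalanced data because it drops to zero whenever either class is entirely missed.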
Abstract: This paper proposes a hybrid Bayesian Network (BN) method for short-term forecasting of crude oil prices. The method is a hybrid, based on both the classification of influencing factors and the regression of the out-of-sample values. For performance comparison, several other hybrid methods have also been devised using Markov Chain Monte Carlo (MCMC), Random Forest (RF), Support Vector Machine (SVM), neural networks (NNET) and generalized autoregressive conditional heteroskedasticity (GARCH). The hybrid methodology relies primarily on constructing the crude oil price forecast from the summation of its Intrinsic Mode Functions (IMF) and its residue, extracted by an Empirical Mode Decomposition (EMD) of the original crude price signal. The Volatility Index (VIX) and the Implied Oil Volatility Index (OVX) have been considered among the influencing parameters of the crude price forecast. The final set of influencing parameters was selected as the whole set of significant contributors detected by the methods of Bayesian Network, Quantile Regression with Lasso penalty (QRL), Bayesian Lasso (BLasso) and Bayesian Ridge Regression (BRR). The performance of the proposed hybrid-BN method is reported for three crude price benchmarks: West Texas Intermediate, Brent Crude and the OPEC Reference Basket.
Funding: This work was supported by the National Key Research and Development Program of China (2022YFC3003101), the Fundamental Research Funds for the Central Universities (WK2320000050), and the Science and Technology Program of State Grid Anhui Electric Power Co., Ltd. (521205220001).
Abstract: Forest fire accidents caused by distribution line faults occur frequently, with heavy impacts on people's safety and on social and economic development. Currently, there are few risk assessments for forest fires induced by overhead distribution lines, and existing assessment methods may have difficulties in data acquisition. On this basis, a novel assessment framework based on an analytic hierarchy process, a Bayesian network and a Fussell-Vesely importance metric is proposed in this paper. The framework combines field research with historical operation and maintenance data to assess the regional-scale risk of forest fires induced by overhead distribution lines, to derive the probability of forest fires, and to identify high-risk lines and key hazard events in the assessment region. Finally, taking the southern Anhui region as an example, the annual probability of forest fires induced by overhead distribution lines in the region is found to be 5.88%, and rectification measures are proposed. This study provides management with a complete assessment framework that reduces the difficulty of data collection and allows additional targeted corrective measures to be proposed for the entire region and for individual routes on the basis of the assessment results.
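Abstract: To mitigate overfitting on small-sample data and improve the generalization ability and prediction accuracy of remote sensing monitoring models for wheat stripe rust, canopy solar-induced chlorophyll fluorescence (SIF) data acquired in 2018 at the experimental station of the Chinese Academy of Agricultural Sciences in Hebei Province were used as the data source. The cost-complexity pruning (CCP) algorithm was used to impose pruning constraints on random forest regression (RFR), and the Bayesian optimization (BO) algorithm was used to select the RFR hyperparameters, yielding a wheat stripe rust severity prediction model based on the constrained random forest regression (CO-RFR) algorithm. Its accuracy was compared with that of remote sensing monitoring models for wheat stripe rust built with the classification and regression tree (CART) algorithm, the conventional RFR algorithm, and multiple linear regression (MLR). The results show that: (1) The CO-RFR model had the highest estimation accuracy and is better suited to remote sensing monitoring of wheat stripe rust with small-sample data. On the validation data set, the mean RMSE between the severity level (SL) predicted by the CO-RFR model and the measured SL was 43%, 50% and 40% lower than for the RFR, CART and MLR models, respectively, and the mean R² was 56%, 47% and 40% higher, respectively. (2) Adding constraints effectively reduced overfitting and improved the generalization ability of the model. For the RFR model, the mean RMSE between predicted and measured SL on the training set was 62% lower than on the validation set, indicating that training accuracy was far higher than validation accuracy and that the model overfitted. For the CO-RFR model, the mean RMSE between predicted and measured SL on the training set was 8% lower than on the validation set, indicating a good fit and a marked reduction in overfitting. This study is of significance for improving the remote sensing prediction accuracy of wheat stripe rust severity with small-sample data, and it also provides an application reference for stress monitoring of other crops.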