Modeling the spatial distribution of soil heavy metals is important in determining the safety of contaminated soils for agricultural use. This study utilized 60 topsoil samples (0-30 cm), multispectral images (Sentinel-2), spectral indices, and ancillary data to model the spatial distribution of heavy metals in the soils along the Nairobi River. The model was generated using the Random Forest package in R. Using R^2 to assess the prediction accuracy, the Random Forest model generated satisfactory results for all the elements. It also ranked the variables in order of their importance in the overall prediction. Spectral indices were the most important variables within the rankings. From the predicted topsoil maps, there were high concentrations of cadmium on the easterly end of the river. Cadmium is an impurity in detergents, and this section is in close proximity to the Nairobi water sewerage plant, which could be a direct source of cadmium. Some farms had zinc levels above the World Health Organization recommended limit. The Random Forest model performed satisfactorily. However, the predictions can be improved further if the spatial resolutions of the various variables are increased and through the addition of more predictor variables.
Funding: Supported by the Startup Grant (PG18929) awarded to F. Shokoohi.
Abstract: Accurate Electric Load Forecasting (ELF) is crucial for optimizing production capacity, improving operational efficiency, and managing energy resources effectively. Moreover, precise ELF contributes to a smaller environmental footprint by reducing the risks of disruption, downtime, and waste. However, with increasingly complex energy consumption patterns driven by renewable energy integration and changing consumer behaviors, no single approach has emerged as universally effective. In response, this research presents a hybrid modeling framework that combines the strengths of Random Forest (RF) and Autoregressive Integrated Moving Average (ARIMA) models, enhanced with an advanced feature selection method, Minimum Redundancy Maximum Relevancy and Maximum Synergy (MRMRMS), to produce a sparse model. Additionally, the residual patterns are analyzed to enhance forecast accuracy. High-resolution weather data from Weather Underground and historical energy consumption data from PJM for Duke Energy Ohio and Kentucky (DEO&K) are used in this application. This methodology, termed SP-RF-ARIMA, is evaluated against existing approaches; it demonstrates a reduction of more than 40% in mean absolute error and root mean square error compared to the second-best method.
Funding: Funded by Institutional Fund Projects under grant no. IFPDP-261-22.
Abstract: Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in the smart environment. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme Gradient Boosting (XGBoost), are often plagued by high computational costs, which makes real-time detection challenging. We therefore propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model makes predictions from those extracted features. Additionally, ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative study highlights the strategy's success: it improves performance while maintaining a lower computational cost. The method is therefore both effective and well suited to real-time applications.
Funding: Supported by the National Natural Science Foundation of China [42030109, 42074012]; the Scientific Study Project for Institutes of Higher Learning, Ministry of Education, Liaoning Province [LJKMZ20220673]; the State Key Laboratory of Geodesy and Earth's Dynamics, Innovation Academy for Precision Measurement Science and Technology [SKLGED2023-3-2]; the Liaoning Revitalization Talent Program [XLYC2203162]; and the Natural Science Foundation of Hebei Province in China [D2023402024].
Abstract: Zenith wet delay (ZWD) is a key parameter for the precise positioning of global navigation satellite systems (GNSS) and occupies a central role in meteorological research. Currently, most models only consider the periodic variability of the ZWD, neglecting the effect of nonlinear factors on the ZWD estimation. This oversight results in a limited capability to reflect the rapid fluctuations of the ZWD. To more accurately capture and predict complicated variations in ZWD, this paper develops the CRZWD model by combining the GPT3 model and the random forest (RF) algorithm, using 5-year atmospheric profiles from 70 radiosonde (RS) stations across China. Taking data from 25 external test stations as reference, the root mean square (RMS) of the CRZWD model is 29.95 mm. Compared with the GPT3 model and another model using a backpropagation neural network (BPNN), the accuracy improves by 24.7% and 15.9%, respectively. Notably, over 56% of the test stations exhibit an improvement of more than 20% relative to GPT3-ZWD. Further temporal and spatial characteristic analyses also demonstrate the significant accuracy and stability advantages of the CRZWD model, indicating its potential for GNSS-based applications.
Funding: Supported by the project of the China Geological Survey (DD20230591).
Abstract: To enhance the prediction accuracy of landslides in Longyan City, China, this study developed a methodology for geologic hazard susceptibility assessment based on a coupled model composed of a Geographic Information System (GIS) with integrated spatial data, a frequency ratio (FR) model, and a random forest (RF) model (also referred to as the coupled FR-RF model). The coupled FR-RF model was constructed based on the analysis of nine influential factors, including distance from roads, normalized difference vegetation index (NDVI), and slope. The performance of the coupled FR-RF model was assessed using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves, yielding Area Under the Curve (AUC) values of 0.93 and 0.95, which indicate high predictive accuracy and reliability for geological hazard forecasting. Based on the model predictions, five susceptibility levels were determined in the study area, providing crucial spatial information for geologic hazard prevention and control. The contributions of the influential factors to landslide susceptibility were determined using SHapley Additive exPlanations (SHAP) analysis and the Gini index, enhancing model interpretability and transparency. Additionally, this study discussed the limitations of the coupled FR-RF model and the prospects for its improvement using new technologies. This study provides an innovative method and theoretical support for geologic hazard prediction and management, with promising prospects for application.
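The frequency ratio step can be illustrated with a small sketch. The abstract does not give the formula, so this uses the definition commonly found in the susceptibility literature: the share of landslide pixels falling in a factor class divided by the share of all pixels in that class. The function name, the slope classes, and the pixel counts are hypothetical and serve only as an illustration.

```python
def frequency_ratio(class_pixels, class_landslide_pixels):
    """Return the frequency ratio (FR) of each class of one conditioning factor.

    FR = (share of landslide pixels in the class) / (share of all pixels in the class)
    """
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(class_landslide_pixels.values())
    ratios = {}
    for name, n_pixels in class_pixels.items():
        landslide_share = class_landslide_pixels.get(name, 0) / total_landslides
        area_share = n_pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios


# Hypothetical slope classes (degrees) and pixel counts, for illustration only.
slope_pixels = {"0-10": 5000, "10-25": 3000, "25+": 2000}
slope_landslides = {"0-10": 10, "10-25": 40, "25+": 50}

fr = frequency_ratio(slope_pixels, slope_landslides)
for name, value in fr.items():
    print(f"slope {name}: FR = {value:.2f}")
```

An FR above 1 means landslides are over-represented in that class. In a coupled workflow, such FR values could plausibly replace the raw class labels as inputs to the random forest, although the abstract does not confirm that this is how the two models were linked.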
Abstract: This paper presents new trading models for the stock market and tests whether they are able to consistently generate excess returns from the Singapore Exchange (SGX). Instead of modeling stock prices in the conventional way, we construct models that relate the market indicators to a trading decision directly. Furthermore, unlike a reversal trading system or a binary system of buy and sell, we allow three modes of trades, namely buy, sell, or stand by. The stand-by case is important because it caters to market conditions in which a model does not produce a strong buy or sell signal. Linear trading models are first developed with the scoring technique, which gives higher weights to successful indicators, and with the Least Squares technique, which tries to match the past perfect trades with its weights. The linear models are then made adaptive by using a forgetting factor to address market changes. Because stock markets can be highly nonlinear, the Random Forest is adopted as a nonlinear trading model and improved with Gradient Boosting to form a new technique, Gradient Boosted Random Forest. All the models are trained and evaluated on nine stocks and one index, and statistical tests such as randomness, linear correlation, and nonlinear correlation are conducted on the data to check the statistical significance of the inputs and their relation with the output before a model is trained. Our empirical results show that the proposed trading methods are able to generate excess returns compared with the buy-and-hold strategy.
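The three-mode decision described above can be sketched as a thresholded linear score. This is a minimal sketch under stated assumptions, not the paper's model: the indicator encoding (+1 bullish, -1 bearish), the weights, the symmetric threshold, and the function name are all hypothetical.

```python
def trading_decision(indicators, weights, threshold):
    """Map a weighted sum of indicator signals to 'buy', 'sell', or 'stand by'.

    A score inside [-threshold, threshold] is treated as too weak to trade on.
    """
    score = sum(w * x for w, x in zip(weights, indicators))
    if score > threshold:
        return "buy"
    if score < -threshold:
        return "sell"
    return "stand by"


# Hypothetical weights; indicators are encoded as +1 (bullish) or -1 (bearish).
weights = [0.5, 0.3, 0.2]
print(trading_decision([1, 1, -1], weights, threshold=0.4))
print(trading_decision([-1, -1, 1], weights, threshold=0.4))
print(trading_decision([1, -1, -1], weights, threshold=0.4))
```

The threshold controls how often the model stands by: a wider band trades less frequently but only on stronger signals.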
Funding: This work was supported in part by the National Natural Science Foundation of China (61601418, 41602362, 61871259), in part by the Opening Foundation of Hunan Engineering and Research Center of Natural Resource Investigation and Monitoring (2020-5), in part by the Qilian Mountain National Park Research Center (Qinghai) (grant number GKQ2019-01), and in part by the Geomatics Technology and Application Key Laboratory of Qinghai Province (grant no. QHDX-2019-01).
Abstract: This work aimed to generate landslide susceptibility maps for the Three Gorges Reservoir (TGR) area, China, by using different machine learning models. Three methods, namely gradient boosting decision tree (GBDT), random forest (RF), and information value (InV) models, were used, and their performances were assessed and compared. In total, 202 landslides were mapped by using a series of field surveys, aerial photographs, and reviews of historical and bibliographical data. Nine causative factors were then considered in landslide susceptibility map generation by using the GBDT, RF, and InV models. All of the maps of the causative factors were resampled to a resolution of 28.5 m. Of the 486289 pixels in the area, 28526 pixels were landslide pixels and 457763 pixels were non-landslide pixels. Finally, landslide susceptibility maps were generated by using the three models, and their performances were assessed through receiver operating characteristic (ROC) curves, sensitivity, specificity, overall accuracy (OA), and the kappa coefficient (KAPPA). The results showed that the GBDT, RF, and InV models overall produced reasonably accurate landslide susceptibility maps. Among the three, the GBDT method outperformed the other two, and it can provide strong technical support for producing landslide susceptibility maps in the TGR area.
Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Abstract: Height–diameter relationships are essential elements of forest assessment and modeling efforts. In this work, two linear and eighteen nonlinear height–diameter equations were evaluated to find a local model for Oriental beech (Fagus orientalis Lipsky) in the Hyrcanian Forest in Iran. The predictive performance of these models was first assessed by different evaluation criteria: adjusted R^2 (R^2_adj), root mean square error (RMSE), relative RMSE (%RMSE), bias, and relative bias (%bias). The best model was selected for use as the base mixed-effects model. Random parameters for test plots were estimated with different tree selection options. Results show that the Chapman–Richards model had better predictive ability than the other models in terms of adjusted R^2 (0.81), RMSE (3.7 m), %RMSE (12.9), bias (0.8), and %bias (2.79). Furthermore, the calibration response, based on a selection of four trees from the sample plots, resulted in a reduction in bias and RMSE of about 1.6–2.7%. Our results indicate that the calibrated model produced the most accurate results.
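For readers unfamiliar with the Chapman–Richards curve, the sketch below evaluates one widely used height–diameter form, h = 1.3 + a(1 - exp(-b·d))^c. The abstract does not state which parameterization the authors fitted, so the 1.3 m breast-height offset, the function name, and the parameter values a, b, and c are assumptions chosen only to show the shape of the curve.

```python
import math

BREAST_HEIGHT_M = 1.3  # assumed height at which diameter is measured


def chapman_richards_height(dbh_cm, a, b, c):
    """Predict tree height (m) from diameter at breast height (cm).

    a sets the asymptote above breast height, b the growth rate, c the shape.
    """
    return BREAST_HEIGHT_M + a * (1.0 - math.exp(-b * dbh_cm)) ** c


# Hypothetical parameters, not fitted values from the study.
params = {"a": 35.0, "b": 0.04, "c": 1.2}
for dbh in (10, 30, 60):
    height = chapman_richards_height(dbh, **params)
    print(f"dbh = {dbh} cm -> height = {height:.1f} m")
```

In a mixed-effects calibration such as the one described, plot-level random effects would typically adjust one or more of these parameters using a few measured trees per plot.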
Funding: Funded by the National Science Foundation of China (42177164), the Distinguished Youth Science Foundation of Hunan Province of China (2022JJ10073), and the Innovation-Driven Project of Central South University (2020CX040).
Abstract: After the excavation of a roadway, the original stress balance is destroyed, resulting in the redistribution of stress and the formation of an excavation damaged zone (EDZ) around the roadway. The thickness of the EDZ is the key basis for roadway stability discrimination and support structure design, so accurately predicting it is of great engineering significance. Considering the advantages of machine learning (ML) in dealing with high-dimensional, nonlinear problems, a hybrid prediction model based on the random forest (RF) algorithm is developed in this paper. The model used the dragonfly algorithm (DA) to optimize two hyperparameters in RF, namely mtry and ntree, and used mean absolute error (MAE), root mean square error (RMSE), determination coefficient (R^2), and variance accounted for (VAF) to evaluate model prediction performance. A database containing 217 sets of data was collected, with embedding depth (ED), drift span (DS), surrounding rock mass strength (RMS), and joint index (JI) as input variables and the excavation damaged zone thickness (EDZT) as the output variable. In addition, four classic models, back propagation neural network (BPNN), extreme learning machine (ELM), radial basis function network (RBF), and RF, were compared with the DA-RF model. The results showed that the DA-RF model had the best prediction performance (training set: MAE = 0.1036, RMSE = 0.1514, R^2 = 0.9577, VAF = 94.2645; test set: MAE = 0.1115, RMSE = 0.1417, R^2 = 0.9423, VAF = 94.0836). The sensitivity analysis showed that the relative importance of the input variables, ranked from low to high, was DS, ED, RMS, and JI.
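The four evaluation metrics named above can be computed as in the following sketch. The abstract does not give its formulas, so the definitions here are the commonly used ones (R^2 as one minus the ratio of residual to total sum of squares, VAF as one minus the ratio of residual variance to target variance, in percent). The function name and the sample data are hypothetical.

```python
import math
from statistics import pvariance


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2, and VAF (in percent) for paired observations."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # VAF compares the variance of the residuals with the variance of the target.
    vaf = (1.0 - pvariance(residuals) / pvariance(y_true)) * 100.0
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "VAF": vaf}


# Hypothetical EDZ thickness values (m): every prediction is 0.5 m too high.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(regression_metrics(measured, predicted))
```

With a constant offset like this one, the residual variance is zero, so VAF stays at 100 while MAE, RMSE, and R^2 all register the error. This is one reason such metrics are usually reported together.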
Abstract: Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of them use only the time-domain information of traffic flow data and ignore the impact of spatial correlation on the flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model based on long short-term memory and random forest (LSTM-RF) is proposed. In the prediction process, the long short-term memory (LSTM) model is used to extract the time-sequence features of the target road segment. Then, the value predicted by the LSTM and the information collected from adjacent upstream and downstream sections are used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction. The model was tested and verified on traffic flow data from 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method achieves better prediction accuracy than the single models, with clearly reduced prediction error.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41977411, 41771383) and the Technology Research Project of the Education Department of Jilin Province (No. JJKH20210445KJ).
Abstract: Given rapid urbanization worldwide, the Urban Heat Island (UHI) effect has become a severe issue limiting urban sustainability in both large and small cities. To study the spatial pattern of the Surface Urban Heat Island (SUHI) in China's Meihekou City, a method combining Monte Carlo and Random Forest Regression (MC-RFR) is developed to model the relationship between landscape pattern indices and Land Surface Temperature (LST). In this method, Monte Carlo acceptance-rejection sampling is added to the bootstrap layer of RFR to ensure the sensitivity of RFR to outliers of the SUHI effect. The SUHI in 2030 was predicted using MC-RFR together with the future landscape pattern modeled by a combined Cellular Automata and Markov model (CA-Markov). Results reveal that forestland can greatly alleviate the SUHI effect, while reasonable construction of urban land can also slow its rising trend. MC-RFR characterizes the relationship between landscape pattern and LST better than RFR alone or a Linear Regression model. By 2030, the overall SUHI effect of Meihekou will be greatly enhanced, and the center of urban development will gradually shift to the central and western regions of the city. We suggest that urban designers and managers concentrate vegetation and disperse built-up land to weaken the SUHI when constructing new urban areas, for the sake of their sustainability.
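The idea of adding an acceptance-rejection step to bootstrap sampling can be sketched generically. The abstract does not specify the acceptance rule used in MC-RFR, so the `accept_prob` callback below is purely hypothetical, as is the function name; this is a minimal illustration of acceptance-rejection applied to bootstrap draws, not the authors' implementation.

```python
import random


def accept_reject_bootstrap(values, accept_prob, n_samples, rng):
    """Draw a bootstrap sample, keeping each draw with probability accept_prob(value).

    values      : the original observations
    accept_prob : hypothetical function mapping an observation to a probability in [0, 1]
    n_samples   : size of the sample to return
    rng         : a random.Random instance, passed in for reproducibility
    """
    # Guard against an endless loop when nothing can ever be accepted.
    if not any(accept_prob(v) > 0 for v in values):
        raise ValueError("accept_prob must be positive for at least one value")

    sample = []
    while len(sample) < n_samples:
        candidate = rng.choice(values)  # ordinary bootstrap draw, with replacement
        if rng.random() < accept_prob(candidate):  # acceptance-rejection step
            sample.append(candidate)
    return sample
```

One possible use, again an assumption, is to give extreme LST observations a higher acceptance probability than typical ones so that each tree's training sample keeps outliers represented.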
Abstract: The potential of the Random Forest Model for mapping different desertification processes was studied in the Muttuma watershed of the mid-Murrumbidgee river region of New South Wales, Australia. A desertification vulnerability index was developed using climate, terrain, vegetation, soil, and land quality indices to identify areas environmentally sensitive to desertification. The Random Forest Model (RFM) was used to predict desertification processes such as soil erosion, salinization, and waterlogging in the watershed; the information needed to train the classification algorithm was obtained from satellite imagery interpretation and ground truth data. Climatic factors (evaporation, rainfall, temperature), terrain factors (aspect, slope, slope length, steepness, and wetness index), soil properties (pH, organic carbon, clay and sand content), and vulnerability indices were used as explanatory variables. Classification accuracy and kappa index were calculated for the training and testing datasets. We recorded overall accuracy rates of 87.7% and 72.1% for training and testing sites, respectively. We found a larger discrepancy between the overall accuracy rate and the kappa index for the testing datasets (72.2% and 27.5%, respectively), suggesting that not all classes are predicted well. Prediction was good for soil erosion and for the no-desertification class but poor for salinization and waterlogging. Overall, the results suggest that knowledge of desertification processes in training areas can be used to predict desertification processes in unvisited areas.
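The gap between overall accuracy and kappa reported in this abstract is typical of imbalanced classes, and a small sketch makes the mechanism concrete. The code below computes overall accuracy and Cohen's kappa from a confusion matrix; it assumes the "kappa index" in the abstract is Cohen's kappa, and the function name `accuracy_and_kappa` is hypothetical.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (total * total)

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For an invented two-class matrix `[[70, 10], [15, 5]]` (not data from the study), overall accuracy is 0.75 while kappa is only about 0.14, because most of the agreement comes from the dominant class and would be expected by chance.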
Abstract: Modeling the spatial distribution of soil heavy metals is important in determining the safety of contaminated soils for agricultural use. This study used 60 topsoil samples (0 - 30 cm), multispectral images (Sentinel-2), spectral indices, and ancillary data to model the spatial distribution of heavy metals in the soils along the Nairobi River. The model was generated using the Random Forest package in R. With R^2 used to assess prediction accuracy, the Random Forest model produced satisfactory results for all the elements. It also ranked the variables in order of their importance in the overall prediction; spectral indices were the most important variables in the rankings. The predicted topsoil maps showed high concentrations of cadmium at the eastern end of the river. Cadmium is an impurity in detergents, and this section is close to the Nairobi water sewerage plant, which could be a direct source of cadmium. Some farms had zinc levels above the limit recommended by the World Health Organization. The Random Forest model performed satisfactorily. However, the predictions could be improved further by increasing the spatial resolution of the variables and by adding more predictor variables.