Traditional 3Ni weathering steel cannot completely meet the requirements for offshore engineering development, so designing novel 3Ni steels with added microalloying elements such as Mn or Nb for strength enhancement has become a trend. The stress-assisted corrosion behavior of a newly designed high-strength 3Ni steel was investigated in the current study using the corrosion big data method. Information on the corrosion process was recorded using the galvanic corrosion current monitoring method. The gradient boosting decision tree (GBDT) machine learning method was used to mine the corrosion mechanism, and the importance of the structure factor was investigated. Field exposure tests were conducted to verify the results calculated using the GBDT method. The results indicated that the GBDT method can be effectively used to study the influence of structural factors on the corrosion process of 3Ni steel. The different mechanisms by which Mn and Cu additions affect the stress-assisted corrosion of 3Ni steel suggested that Mn and Cu have no obvious effect on the corrosion rate of non-stressed 3Ni steel during the early stage of corrosion. When the corrosion reached a stable state, increasing the Mn content increased the corrosion rate of 3Ni steel, while Cu reduced this rate. In the presence of stress, increasing the Mn content and adding Cu can inhibit the corrosion process. The corrosion law of outdoor-exposed 3Ni steel is consistent with the law derived from corrosion big data technology, verifying the reliability of the big data evaluation method and the choice of data prediction model.
To investigate travel time prediction on freeways, a model based on the gradient boosting decision tree (GBDT) is proposed. Eleven variables (namely, travel time in the current period T_i, traffic flow in the current period Q_i, speed in the current period V_i, density in the current period K_i, the number of vehicles in the current period N_i, occupancy in the current period R_i, traffic state parameter in the current period X_i, travel time in the previous period T_{i-1}, etc.) are selected to predict the travel time 10 min ahead. Data obtained from VISSIM simulation are used to train and test the model. The results demonstrate that the prediction error of the GBDT model is smaller than those of the back propagation (BP) neural network model and the support vector machine (SVM) model. Travel time in the current period T_i is the most important variable in the GBDT model. The GBDT model can produce more accurate predictions and mine the hidden nonlinear relationships between the variables and the predicted travel time.
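As a rough illustration of the gradient boosting idea behind the travel time model above, the following sketch fits depth-one regression trees (stumps) to the residuals of a squared-error loss and accumulates a split-gain importance per variable. It is a minimal pure-Python sketch, not the authors' implementation: the two input columns (current travel time and an unrelated nuisance variable), the toy values, and the function names `fit_stump` and `fit_gbdt` are all assumptions made for the example.

```python
def fit_stump(rows, residuals):
    """Least-squares stump: return the (feature, threshold) split with lowest SSE.

    Assumes at least one feature takes two or more distinct values.
    """
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so the right-hand side of a split is never empty.
        for t in sorted({row[j] for row in rows})[:-1]:
            left = [r for row, r in zip(rows, residuals) if row[j] <= t]
            right = [r for row, r in zip(rows, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best


def fit_gbdt(rows, ys, n_trees=50, learning_rate=0.1):
    """Gradient boosting with stumps under squared-error loss."""
    base = sum(ys) / len(ys)              # initial prediction: the mean
    preds = [base] * len(ys)
    stumps = []
    importance = [0.0] * len(rows[0])     # accumulated split gain per feature
    for _ in range(n_trees):
        # For squared error, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        sse, j, t, lm, rm = fit_stump(rows, residuals)
        mean_r = sum(residuals) / len(residuals)
        importance[j] += sum((r - mean_r) ** 2 for r in residuals) - sse
        stumps.append((j, t, lm, rm))
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for p, row in zip(preds, rows)]

    def predict(row):
        return base + learning_rate * sum(
            lm if row[j] <= t else rm for j, t, lm, rm in stumps)

    return predict, importance


# Toy data (hypothetical): column 0 = current travel time, column 1 = nuisance.
rows = [(5.0, 3), (5.2, 1), (5.1, 4), (6.0, 1),
        (8.5, 5), (9.0, 9), (9.2, 2), (9.1, 6)]
ys = [5.1, 5.3, 5.9, 8.2, 8.9, 9.3, 9.0, 9.2]   # travel time one step ahead
predict, importance = fit_gbdt(rows, ys)
```

On this toy data, the accumulated gain for column 0 exceeds that of column 1, which mirrors the kind of variable ranking the abstract reports. A production model would use deeper trees and held-out data.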
Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment and realize sustainable development. Traditional methods for predicting oil well production trends are based on years of oil field production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in various sub-fields of oil and gas reservoir development. Based on the data-driven artificial intelligence algorithm Gradient Boosting Decision Tree (GBDT), this paper predicts the initial single-layer production by considering geological data, fluid PVT data and well data. The results show that the GBDT prediction model has high accuracy, significantly improved efficiency and strong general applicability. The GBDT model trained in this paper can predict production, which is helpful for well site optimization, perforation layer optimization and engineering parameter optimization, and has guiding significance for oilfield development.
The stability of underground entry-type excavations directly affects the working environment and the safety of staff. Empirical critical span graphs and traditional statistical learning methods cannot meet the high accuracy requirements for stability assessment of entry-type excavations. Therefore, this study proposes a new prediction method based on machine learning to scientifically adjust the critical span graph. Accordingly, the particle swarm optimization (PSO) algorithm is used to optimize the core parameters of the gradient boosting decision tree (GBDT); the resulting model is abbreviated as PSO-GBDT. Moreover, eight other classifiers, including GBDT, k-nearest neighbors (KNN), two kinds of support vector machines (SVM), Gaussian naive Bayes (GNB), logistic regression (LR) and linear discriminant analysis (LDA), are applied for comparison with the proposed model. The findings revealed that, compared with the other eight models, the prediction performance of PSO-GBDT is the most reliable, with a classification accuracy of up to 0.93. Therefore, this model has great potential to provide a more scientific and accurate choice for the stability prediction of underground excavations. In addition, each classification model is used to predict the stability category of several grid points divided by the critical span graph, and the updated critical span graph of each model is discussed in combination with previous studies. The results show that the PSO-GBDT model is scientific, accurate and efficient in updating the critical span graph, and its output decision boundary has strict theoretical support, which can help mine operators make favorable economic decisions.
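The PSO step described above can be sketched as a generic particle swarm minimizer. This is a minimal illustration under stated assumptions, not the authors' PSO-GBDT code: the objective `surrogate_error` is a hypothetical stand-in for "validation error of a GBDT trained with these hyperparameters", and the bounds, swarm size, and coefficients are illustrative choices.

```python
import random


def pso_minimize(objective, bounds, n_particles=12, n_iters=40, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best particle swarm optimization over box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                  # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]      # global best
    history = [g_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
        history.append(g_val)
    return g_pos, g_val, history


def surrogate_error(params):
    """Hypothetical stand-in for a cross-validated GBDT error."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 4.0) ** 2


bounds = [(0.01, 0.5), (1.0, 10.0)]   # assumed ranges: learning rate, tree depth
best_params, best_err, history = pso_minimize(surrogate_error, bounds)
```

In a real pipeline the objective would train and score a classifier for each particle, and integer hyperparameters such as tree depth would be rounded before use.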
To improve the accuracy of target intent recognition, a recognition method based on the XGBoost (eXtreme Gradient Boosting) decision tree is proposed. This paper uses relevant data and a Python program to calculate the probability of tactical intention. The sequence intention probability is then obtained by applying the Dempster-Shafer rule of combination. To verify the accuracy of the recognition results, we compare the experimental results of this paper with results in the literature. The experiment shows that the probability of tactical intention recognition is improved by this method, so the method is feasible.
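The Dempster-Shafer combination step mentioned above can be illustrated with a short sketch of Dempster's rule for two mass functions. The intent labels and the per-time-step masses below are hypothetical values chosen for the example; the abstract does not give the paper's actual frame of discernment or probabilities.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb   # mass that falls on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict         # renormalize the non-conflicting mass
    return {h: w / norm for h, w in combined.items()}


# Hypothetical intents and masses from two consecutive time steps.
ATTACK, RECON = frozenset({"attack"}), frozenset({"recon"})
EITHER = ATTACK | RECON           # mass left uncommitted between the two
step1 = {ATTACK: 0.6, RECON: 0.3, EITHER: 0.1}
step2 = {ATTACK: 0.7, RECON: 0.2, EITHER: 0.1}
fused = combine(step1, step2)
```

With these illustrative inputs, the fused mass on the attack hypothesis is higher than in either single step, which is the reinforcing behavior that makes the rule useful for combining a sequence of per-step classifier outputs.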
The effects of built environment factors on urban vitality have attracted wide attention in urban planning in recent years, but few studies have considered the variables' relative importance and their nonlinear effects on urban vitality. Taking Hangzhou, a Chinese metropolis, as a case study, this study applied the gradient boosting decision tree (GBDT) model to explore the nonlinear effects of the 5D factors of the urban built environment on urban social vitality and economic vitality, as well as the importance of the variables. The results show that the GBDT model has better goodness of fit than traditional ordinary least squares (OLS) regression in the urban vitality models. The urban built environment plays an important role in affecting urban vitality, with built environment design having the most important effect. Specifically, the densities of shopping facilities, medical facilities, and road networks are the most important factors affecting urban social vitality, while road network density, destination accessibility, and population density play the most important roles in affecting urban economic vitality. Finally, the urban built environment factors have nonlinear threshold effects on both urban economic and social vitality in Hangzhou, with differing nonlinear response patterns observed between the social and economic dimensions.
The prediction of power grid faults based on meteorological factors is of great significance for reducing the economic losses caused by power grid faults. However, existing methods fail to effectively extract key features and accurately predict fault types due to the complexity of meteorological factors and their nonlinear relationships. In response to these challenges, we propose the Feature-Enhanced XGBoost power grid fault prediction method (FE-XGBoost). Specifically, we first combine the gradient boosting decision tree and the recursive feature elimination method to extract essential features from meteorological data. Then, we incorporate a piecewise linear chaotic map to enhance the optimization accuracy of the sparrow search algorithm. Finally, we construct an XGBoost-based model for the classification prediction of power grid meteorological faults and optimize hyperparameters such as the tree depth, learning rate, and number of iterations using the enhanced sparrow search algorithm. Experimental results demonstrate that our method outperforms the baseline models in predicting power grid faults accurately.
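The piecewise linear chaotic map named above has a compact standard form, sketched below. How the map is wired into the sparrow search algorithm is not stated in the abstract; using it to seed the initial population, as done here, is one common choice and is an assumption of this example, as are the control parameter `p`, the start value `x0`, and the hyperparameter bounds.

```python
def pwlcm(x, p=0.4):
    """One step of the piecewise linear chaotic map on [0, 1], with 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x               # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)


def chaotic_population(n_individuals, bounds, x0=0.37, p=0.4):
    """Spread a chaotic sequence over the search box, one value per coordinate."""
    population, x = [], x0
    for _ in range(n_individuals):
        individual = []
        for lo, hi in bounds:
            x = pwlcm(x, p)
            individual.append(lo + x * (hi - lo))
        population.append(individual)
    return population


# Hypothetical search ranges: tree depth, learning rate, number of iterations.
bounds = [(3, 10), (0.01, 0.3), (50, 500)]
population = chaotic_population(5, bounds)
```

Each individual would then be scored by training the classifier with those hyperparameters, with integer-valued ones rounded first.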
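Traffic light locations are the points where pedestrians and vehicles meet on roads; they greatly affect road structure and traffic operation and serve as important hubs in urban road networks. To address the insufficient accuracy and incomplete area coverage of current traffic light location detection methods, a traffic light location detection method based on trajectory data is proposed. The method is built on a four-stage clustering-merging-classification-merging model. It first extracts implicit vehicle driving features from the cleaned trajectory data, then applies density-based spatial clustering of applications with noise (DBSCAN) to obtain two types of cluster centers, turning and stopping, and merges these two types of cluster centers to obtain candidate traffic light locations. Traffic flow driving features of each area are extracted from the trajectory points within a certain range of the candidate location, the gradient boosting decision tree (GBDT) algorithm is then used for classification, and finally the positive samples among the candidate locations are fused to detect traffic light locations. Experiments on floating-car GPS trajectory data from Chengdu yield an F1 score of 0.947, outperforming conventional machine learning methods. The experimental results show that, based on GPS trajectory data, the proposed four-level model can effectively detect traffic light locations, and that the model can be used to rapidly acquire and update traffic light location information over large urban areas.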
It is easy for teenagers to view pornographic pictures on social networks. Many researchers have studied the detection of real pornographic pictures, but there are few studies on artificial ones. In this work, we studied how to detect artificial pornographic pictures, especially on social networks. The detection process can be divided into two stages: feature selection and picture detection. In the feature selection stage, seven types of features that favour picture detection were selected. The picture detection stage comprised three steps. 1) To alleviate the imbalance between the numbers of artificial pornographic pictures and normal ones, the training dataset of artificial pornographic pictures was expanded, so the features extracted from the training dataset were expanded as well. 2) To reduce feature extraction time, a fast method was proposed that extracts features from a proportionally scaled picture rather than the original one. 3) Three tree models were compared and a gradient boosting decision tree (GBDT) was selected for the final picture detection. Three sets of experimental results show that the proposed method can achieve better recognition precision and drastically reduce its time cost.
Sepsis poses a serious threat to the health of children in the pediatric intensive care unit. Mortality from pediatric sepsis can be effectively reduced through timely diagnosis and therapeutic intervention. The bacilliculture (bacterial culture) detection method is too time-consuming to allow timely treatment. In this research, we propose a new framework, a deep encoding network with cross features (CF-DEN), that enables accurate early detection of sepsis. Cross features are automatically constructed via the gradient boosting decision tree and distilled into the deep encoding network (DEN) we designed. The DEN aims to learn sufficiently effective representations from clinical test data. Each layer of the DEN filters the features involved in computation at the current layer via an attention mechanism and outputs the current prediction, which is accumulated layer by layer to obtain the embedding feature at the last layer. The framework combines the advantages of tree-based and neural network methods to extract effective representations from a small clinical dataset and obtain accurate predictions, so that patients can receive timely treatment. We evaluate the performance of the framework on a dataset collected from Shanghai Children's Medical Center. Compared with common machine learning methods, our method improves the F1-score by 16.06% on the test set.
y consumption efficiency and to increase the crop yield. With the increase of agricultural data generated by the Internet of Things (IoT), more feasible models are necessary to make full use of such information. In this research, a Gradient Boost Decision Tree (GBDT) model based on the newly developed Light Gradient Boosting Machine algorithm (LightGBM or LGBM) was proposed to model the internal temperature of a greenhouse. Features including climate variables, control variables and additional temporal information collected over five years were used to construct a suitable dataset to train and validate the LGBM model. An adaptive cross-validation method was developed as a novelty to improve the LGBM model's performance and self-adaptive ability. To compare predictive accuracy, a Back-Propagation (BP) Neural Network model and a Recurrent Neural Network (RNN) model were built under the same process. Two other GBDT algorithms, Extreme Gradient Boosting (Xgboost) and Stochastic Gradient Boosting (SGB), were also introduced to compare their predictive accuracy with that of the LGBM model. The results suggest that the LGBM has the best fitting ability for the temperature curves, with an RMSE of 0.645 ℃, as well as the fastest training speed among all algorithms, being 60 times faster than the two neural network algorithms. The LGBM has strong potential for application in both greenhouse environment prediction and real-time predictive control.
The agricultural sector's day-to-day operations, such as irrigation and sowing, are affected by the weather. Weather therefore plays a key role in all regular human activities. Weather forecasting must be accurate and precise so that we can plan our activities and safeguard ourselves and our property from disasters. Rainfall, wind speed, humidity, wind direction, cloud, temperature, and other weather variables are used in this work for weather prediction. Many research works have been conducted on weather forecasting. The drawbacks of existing approaches are that they are less effective, inaccurate, and time-consuming. To overcome these issues, this paper proposes an enhanced and reliable weather forecasting technique, and also develops weather forecasting for remote areas. Weather data analysis and machine learning techniques, such as Gradient Boosting Decision Tree, Random Forest, Bernoulli Naive Bayes, and the KNN algorithm, are deployed to anticipate weather conditions. A comparative analysis of the results aids in determining which ensemble methods may be used to improve the accuracy of weather prediction. The aim of this study is to demonstrate the technique's ability to produce weather forecasts as quickly as possible. Experimental evaluation shows that our ensemble technique achieves 95% prediction accuracy. Prediction takes less than 10 s for 1000 nodes and less than 40 s for 5000 nodes.
To address the personalized movie recommendation problem, a recommendation algorithm integrating manifold learning and ensemble learning is studied. In this work, manifold learning is used to reduce the dimension of the data so that both the time and space complexities of the model are reduced. Meanwhile, the gradient boosting decision tree (GBDT) is used to train the target user profile prediction model. Based on the recommendation results, a Bayesian optimization algorithm is applied to optimize the recommendation model, which can effectively improve the prediction accuracy. The experimental results show that the proposed algorithm can improve the accuracy of movie recommendation.
Rapid urbanization has markedly affected urban ecosystem health (EH), making it imperative to explore the relationships between EH and urbanization, as well as to identify the key factors influencing EH. This study addresses 2 key research gaps: (a) The traditional pressure-state-response evaluation framework fails to integrate ecosystem service demands and landscape pattern indices and has not formed a comprehensive EH evaluation system. (b) There is a lack of research on the drivers and thresholds of EH across areas with different spatial relationships between urbanization and EH at the urban scale. Taking Wuhan, China, as an example, this study assesses EH using an optimized pressure-state-response evaluation framework. Additionally, bivariate Moran's I is used to analyze the spatial relationship between EH and urbanization. We use gradient boosting decision trees to flexibly model the nonlinear relationships between influencing factors and EH, while Shapley additive explanations quantify each factor's contribution, enhancing model interpretability and clarifying the factors' effects on EH. The findings reveal a spatial distribution pattern characterized by lower EH levels in central areas and higher EH levels in peripheral areas, with a notable negative spatial correlation between EH and urbanization. The spatial heterogeneity and clustering of EH and urbanization across Wuhan exhibit a ringlike pattern radiating from the center to the periphery. Landscape pattern index and land use are identified as key factors influencing EH in Wuhan, with substantial regional variation, necessitating targeted environmental protection strategies. This study offers insights into urban planning and policymaking, promoting sustainable urban development.
The rapid development of digital technologies has driven the emergence and popularization of online-to-offline (O2O) retail, reshaping the retail landscape in urban China. However, the spatial distribution characteristics and influencing mechanisms of emerging O2O retail have not been thoroughly investigated in existing studies. Taking the central urban area of Guangzhou as a case, this study used multi-source data and machine learning methods to explore the distribution characteristics of O2O retail space and to identify the nonlinear effects of built environment, sociodemographic, and economic factors on its distribution. The results revealed that O2O retail space exhibited a 'single-center' distribution pattern, in contrast to the 'multi-center' distribution pattern of traditional retail space. This finding supported the diffusion of innovation hypothesis, highlighting that O2O retail modes first spread from traditionally developed retail space. Furthermore, spatial heterogeneities were observed across different types of O2O retail space: O2O in-store retail showed a 'core-periphery' spatial structure as described by Central Place Theory, whereas O2O delivery displayed a 'horizontal, non-hierarchical, and multi-centered' network structure following Central Flow Theory. Compared with traditional retail space, the distribution of O2O retail space was more influenced by sociodemographic factors such as the proportion of youth, education level, and income level, but less affected by built environment factors such as office and building density. In addition, nonlinear effects of these influencing factors on the distribution of O2O retail space were identified, which enriched the existing literature by highlighting effective ranges and threshold effects. These findings provided valuable insights into O2O retail space development in the context of digital transformation.
Pedestrian well-being reflects emotional experience during walking. Analyzing which built environment factors influence pedestrian well-being not only helps to improve residents' physical and mental health but also encourages more walking. Based on data obtained via a questionnaire survey in Harbin, China, a gradient boosting decision tree (GBDT) model is developed to analyze how the perception of the built environment influences pedestrian well-being and to explain the differences across types of neighborhoods (old, new, and mixed). The results show that pedestrian well-being is most influenced by the diversity of daily service facilities, followed by the number of commercial facilities along a street, the accessibility of daily service facilities, and green spaces. Moreover, pedestrian well-being is also influenced by the type of neighborhood. In new neighborhoods, it is dominated by the accessibility of public transport stations, while in old and mixed neighborhoods, pedestrian well-being is primarily determined by the accessibility of green spaces and the number of green spaces, respectively. Depending on the characteristics of the built environment, different intervention measures are proposed to improve pedestrian well-being and promote walking.
Transit-oriented development is extensively employed for the development of metro stations. Service facilities are key components of the regional built environment of metro stations, and the arrangement and design of these facilities significantly impact regional economic efficiency, particularly housing prices. Data from Tianjin, China are analyzed using the gradient boosting decision tree model to explore the impacts of service facilities on housing prices at various operational stages. The results showed that: (1) the development of metro stations encourages the evolution of housing prices and service facilities, and the concentration effect of service facility allocation intensity is more pronounced in the urban central circle; (2) the characteristics of service facilities in metro station areas have nonlinear and threshold effects on housing prices; (3) the gradient boosting decision tree model explains the premium effect better than the ordinary least squares model; (4) kernel density is the most significant factor affecting housing prices, with per capita quantity decreasing as diversity increases over time. The results encourage government departments to improve rail transit links and to optimize public service facilities in station areas.
With the accumulation of dysregulated circular RNAs (circRNAs) in pathological processes, the regulatory functions of circRNAs, especially as microRNA (miRNA) sponges and through their interactions with RNA-binding proteins (RBPs), have been widely validated. However, the collected information on experimentally validated circRNA-disease associations is only preliminary. Therefore, an updated CircR2Disease database providing a comprehensive resource and web tool to clarify the relationships between circRNAs and diseases in diverse species is necessary. Here, we present an updated CircR2Disease v2.0 with an increased number of circRNA-disease associations and novel characteristics. CircR2Disease v2.0 provides more than five times as many experimentally validated circRNA-disease associations as its previous version. First, this version includes 4201 entries between 3077 circRNAs and 312 disease subtypes. Secondly, information on circRNA-miRNA, circRNA-miRNA-target, and circRNA-RBP interactions has been manually collected for various diseases. Thirdly, the gene symbols of circRNAs and disease name IDs can be linked with various nomenclature databases. Detailed descriptions such as samples and journals have also been integrated into the updated version. Thus, CircR2Disease v2.0 can serve as a platform for users to systematically investigate the roles of dysregulated circRNAs in various diseases and further explore posttranscriptional regulatory functions in diseases. Finally, we propose a computational method named circDis, based on the graph convolutional network (GCN) and gradient boosting decision tree (GBDT), to illustrate the applications of the CircR2Disease v2.0 database. CircR2Disease v2.0 is available at http://bioinfo.snnu.edu.cn/CircR2Disease_v2.0 and https://github.com/bioinforlab/CircR2Disease-v2.0.
As a typical screening apparatus, the elliptically vibrating screen has been extensively employed for the size classification of granular materials. Sustained efforts have been made to improve sieving performance, but the optimization problem still perplexes researchers because of the complexity of the sieving process. In the present paper, the sieving process of an elliptically vibrating screen was numerically simulated based on the Discrete Element Method (DEM). The production quality and the processing capacity of the vibrating screen were measured by the screening efficiency and the screening time, respectively. The sieving parameters investigated were the length of the semi-major axis, the length ratio of the two semi-axes, the vibration frequency, the inclination angle, the vibration direction angle and the motion direction of the screen deck. First, the Gradient Boosting Decision Trees (GBDT) algorithm was adopted to model the screening data. Trained prediction models with sufficient generalization performance were obtained, and the relative importance of the six parameters for both screening indexes was revealed. After that, a hybrid MACO-GBDT algorithm based on Ant Colony Optimization (ACO) was proposed for optimizing the sieving performance of the vibrating screen. Both single-objective optimization of screening efficiency and stepwise optimization of screening results were conducted. Finally, the reliability of the MACO-GBDT algorithm was examined by numerical experiments. The optimization strategy provided in this work should be helpful for the parameter design and performance improvement of vibrating screens.
Funding (3Ni steel stress-assisted corrosion study): supported by the National Natural Science Foundation of China (No. 52203376) and the National Key Research and Development Program of China (No. 2023YFB3813200).
Funding (freeway travel time prediction study): the National Natural Science Foundation of China (No. 51478114, 51778136).
文摘To investigate the travel time prediction method of the freeway, a model based on the gradient boosting decision tree (GBDT) is proposed. Eleven variables (namely, travel time in current period T i , traffic flow in current period Q i , speed in current period V i , density in current period K i , the number of vehicles in current period N i , occupancy in current period R i , traffic state parameter in current period X i , travel time in previous time period T i -1 , etc.) are selected to predict the travel time for 10 min ahead in the proposed model. Data obtained from VISSIM simulation is used to train and test the model. The results demonstrate that the prediction error of the GBDT model is smaller than those of the back propagation (BP) neural network model and the support vector machine (SVM) model. Travel time in current period T i is the most important variable among all variables in the GBDT model. The GBDT model can produce more accurate prediction results and mine the hidden nonlinear relationships deeply between variables and the predicted travel time.
Abstract: Accurate prediction of monthly oil and gas production is essential for oil enterprises to make reasonable production plans, avoid blind investment, and realize sustainable development. Traditional oil well production trend prediction methods are based on years of oil field production experience and expertise, and their application conditions are very demanding. With the rapid development of artificial intelligence technology, big data analysis methods are gradually being applied in various sub-fields of oil and gas reservoir development. Based on the data-driven artificial intelligence algorithm Gradient Boosting Decision Tree (GBDT), this paper predicts the initial single-layer production by considering geological data, fluid PVT data, and well data. The results show that the GBDT prediction model has high accuracy, significantly improved efficiency, and strong general applicability. The GBDT model trained in this paper can predict production, which is helpful for well site optimization, perforation layer optimization, and engineering parameter optimization, and has guiding significance for oilfield development.
Funding: The National Science Foundation of China (Grant No. 42177164), the Distinguished Youth Science Foundation of Hunan Province of China (Grant No. 2022JJ10073), and the Innovation-Driven Project of Central South University (Grant No. 2020CX040).
Abstract: The stability of underground entry-type excavations directly affects the working environment and the safety of staff. Empirical critical span graphs and traditional statistical learning methods cannot meet the high accuracy required for the stability assessment of entry-type excavations. Therefore, this study proposes a new prediction method based on machine learning to scientifically adjust the critical span graph. Accordingly, the particle swarm optimization (PSO) algorithm is used to optimize the core parameters of the gradient boosting decision tree (GBDT), a combination abbreviated as PSO-GBDT. Moreover, eight other classifiers, including GBDT, k-nearest neighbors (KNN), two kinds of support vector machines (SVM), Gaussian naive Bayes (GNB), logistic regression (LR), and linear discriminant analysis (LDA), are also applied for comparison with the proposed model. The findings revealed that, compared with the other eight models, the prediction performance of PSO-GBDT is the most reliable, with a classification accuracy of up to 0.93. Therefore, this model has great potential to provide a more scientific and accurate choice for the stability prediction of underground excavations. In addition, each classification model is used to predict the stability category of several grid points divided by the critical span graph, and the updated critical span graph of each model is discussed in combination with previous studies. The results show that the PSO-GBDT model is scientific, accurate, and efficient in updating the critical span graph, and its output decision boundary has strict theoretical support, which can help mine operators make favorable economic decisions.
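The PSO step described above can be sketched as follows. This is a generic particle swarm optimizer, not the authors' implementation: the abstract does not say which GBDT parameters were tuned, so the choice of `learning_rate` and `max_depth`, the search bounds, and the `validation_error` function (a hypothetical stand-in for a cross-validated GBDT error) are all assumptions.

```python
import random


def validation_error(learning_rate, max_depth):
    """Hypothetical stand-in for the cross-validated error of a GBDT.
    A real objective would train a model and return, e.g., 1 - accuracy."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 4.0) ** 2


def pso(objective, bounds, n_particles=20, n_iters=60,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position per particle
    pbest_val = [objective(*p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position of the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull towards personal best + pull towards global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # clamp to bounds
            val = objective(*pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best_params, best_err = pso(validation_error, bounds=[(0.01, 0.5), (1.0, 10.0)])
```

In a real pipeline `max_depth` would be rounded to an integer before training, and each objective evaluation would involve fitting a classifier, so the swarm size and iteration count trade accuracy against training cost.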
Abstract: To improve the accuracy of target intent recognition, a recognition method based on the XGBoost (eXtreme Gradient Boosting) decision tree is proposed. This paper uses relevant data and a Python program to calculate the probability of tactical intention. The sequence intention probability is then obtained by applying the Dempster-Shafer rule of combination. To verify the accuracy of the recognition results, the experimental results of this paper are compared with results in the literature. The experiment shows that the probability of tactical intention recognition is improved by this method, so the method is feasible.
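The Dempster-Shafer rule of combination mentioned above fuses two mass functions by multiplying the masses of intersecting hypotheses and renormalising by the non-conflicting mass. A minimal sketch follows; the two intent labels and the mass values are hypothetical and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: m(A) = sum over B∩C=A of m1(B)*m2(C), divided by (1 - K),
    where K is the total mass assigned to conflicting (disjoint) pairs."""
    combined = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}


# Hypothetical frame of discernment with two tactical intents.
ATTACK, RECON = frozenset({"attack"}), frozenset({"recon"})
EITHER = ATTACK | RECON  # mass on the whole frame expresses uncertainty

# Hypothetical mass functions from two consecutive time steps.
m_t1 = {ATTACK: 0.6, RECON: 0.1, EITHER: 0.3}
m_t2 = {ATTACK: 0.5, RECON: 0.2, EITHER: 0.3}

fused = combine(m_t1, m_t2)
```

Because both sources lean towards the same intent, the fused belief in `ATTACK` is higher than either input, which is how combining evidence over a sequence can sharpen the recognised intent.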
Funding: National Social Science Foundation of China, No. 20FJLB025; National Natural Science Foundation of China, No. 42471207, No. 42471203; Zhejiang Province Philosophy and Social Science Planning, Zhijiang Youth Special Project, 24ZJQN118Y.
Abstract: The effects of built environment factors on urban vitality have attracted wide attention in urban planning in recent years, but few studies have considered the variables' relative importance and their nonlinear effects on urban vitality. Taking Hangzhou, a Chinese metropolis, as a case study, this study applied the gradient boosting decision tree (GBDT) model to explore the nonlinear effects of the 5D factors of the urban built environment on urban social vitality and economic vitality, as well as the importance of the variables. The results show that the GBDT model has better goodness of fit than traditional ordinary least squares (OLS) regression in the urban vitality models. The urban built environment plays an important role in affecting urban vitality, with built environment design having the greatest effect. Specifically, the densities of shopping facilities, medical facilities, and road networks are the most important factors affecting urban social vitality, while road network density, destination accessibility, and population density play the most important roles in affecting urban economic vitality. Finally, the urban built environment factors have nonlinear threshold effects on both urban economic and social vitality in Hangzhou, with differing nonlinear response patterns observed between the social and economic dimensions.
Funding: Supported by the Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (Research on Power Meteorology Digitalization Application for Future Climate Scenarios and New Energy Operation Risks, J2023076).
Abstract: The prediction of power grid faults based on meteorological factors is of great significance for reducing the economic losses caused by power grid faults. However, existing methods fail to effectively extract key features and accurately predict fault types because of the complexity of meteorological factors and their nonlinear relationships. In response to these challenges, we propose the Feature-Enhanced XGBoost power grid fault prediction method (FE-XGBoost). Specifically, we first combine the gradient boosting decision tree and the recursive feature elimination method to extract essential features from meteorological data. Then, we incorporate a piecewise linear chaotic map to enhance the optimization accuracy of the sparrow search algorithm. Finally, we construct an XGBoost-based model for the classification prediction of power grid meteorological faults and optimize hyperparameters such as the tree depth, learning rate, and number of iterations using the enhanced sparrow search algorithm. Experimental results demonstrate that our method outperforms the baseline models in predicting power grid faults accurately.
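For reference, the piecewise linear chaotic map (PWLCM) is commonly defined on [0, 1) with a control parameter 0 < p < 0.5. The sketch below implements that standard definition and uses it to spread an initial population over a hyperparameter search box. The abstract does not state how the map enters the sparrow search algorithm, so using it for population initialization, the values of `p` and `x0`, and the example bounds are all assumptions.

```python
def pwlcm(x, p=0.3):
    """One iteration of the piecewise linear chaotic map, with 0 < p < 0.5."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    if x < 1.0 - p:
        return (1.0 - p - x) / (0.5 - p)
    return (1.0 - x) / p


def chaotic_population(n_individuals, bounds, x0=0.123456, p=0.3):
    """Generate initial positions by iterating the map and scaling to `bounds`."""
    x = x0
    population = []
    for _ in range(n_individuals):
        individual = []
        for lo, hi in bounds:
            x = pwlcm(x, p)
            individual.append(lo + x * (hi - lo))  # map [0, 1] onto [lo, hi]
        population.append(individual)
    return population


# Hypothetical search box: tree depth, learning rate, number of iterations.
bounds = [(2.0, 10.0), (0.01, 0.3), (50.0, 500.0)]
population = chaotic_population(10, bounds)
```

Chaotic sequences of this kind are often preferred to plain pseudo-random draws in metaheuristics because they tend to cover the search space more evenly, which can help a swarm avoid starting in a cluster.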
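Abstract: Traffic lights are located where pedestrians and vehicles meet on the road; they strongly affect road structure and traffic operation and act as important hubs in the urban road network. To address the insufficient accuracy and incomplete area coverage of current traffic light location detection methods, a traffic light location detection method based on trajectory data is proposed. The method is built on a four-stage clustering-merging-classification-merging model. First, implicit vehicle driving features are extracted from cleaned trajectory data, and density-based spatial clustering of applications with noise (DBSCAN) is used to obtain two types of cluster centers, turning and stopping; these two types of cluster centers are merged to obtain candidate traffic light locations. Next, traffic flow features of each area are extracted from the trajectory points within a certain range of each candidate location, and the gradient boosting decision tree (GBDT) algorithm is used for classification. Finally, the positive samples among the candidate locations are merged to detect the traffic light locations. Experiments on floating car GPS trajectory data from Chengdu yield an F1 score of 0.947, which is better than conventional machine learning methods. The experimental results show that, based on GPS trajectory data, the proposed four-stage model can effectively detect traffic light locations, and that the model can be used to rapidly acquire and update traffic light location information over large urban areas.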
Funding: Projects (61573380, 61303185) supported by the National Natural Science Foundation of China; Projects (2016M592450, 2017M612585) supported by the China Postdoctoral Science Foundation; Projects (2016JJ4119, 2017JJ3416) supported by the Hunan Provincial Natural Science Foundation of China
Abstract: It is easy for teenagers to view pornographic pictures on social networks. Many researchers have studied the detection of real pornographic pictures, but there are few studies on artificial ones. In this work, we studied how to detect artificial pornographic pictures, especially on social networks. The whole detection process can be divided into two stages: feature selection and picture detection. In the feature selection stage, seven types of features that favour picture detection were selected. The picture detection stage comprised three steps. 1) To alleviate the imbalance between the numbers of artificial pornographic pictures and normal ones, the training dataset of artificial pornographic pictures was expanded, so the features extracted from the training dataset were expanded as well. 2) To reduce the feature extraction time, a fast method was proposed that extracts features from a proportionally scaled picture rather than the original one. 3) Three tree models were compared, and a gradient boosting decision tree (GBDT) was selected for the final picture detection. Three sets of experimental results show that the proposed method can achieve better recognition precision and drastically reduce the time cost.
Abstract: Sepsis poses a serious threat to the health of children in the pediatric intensive care unit. Mortality from pediatric sepsis can be effectively reduced through timely diagnosis and therapeutic intervention. The bacterial culture detection method is too time-consuming to allow timely treatment. In this research, we propose a new framework, a deep encoding network with cross features (CF-DEN), that enables accurate early detection of sepsis. Cross features are automatically constructed via the gradient boosting decision tree and distilled into the deep encoding network (DEN) we designed. The DEN aims to learn a sufficiently effective representation from clinical test data. Each layer of the DEN filters the features involved in computation at the current layer via an attention mechanism and outputs the current prediction, which is accumulated layer by layer to obtain the embedding feature at the last layer. The framework combines the advantages of tree-based and neural network methods to extract an effective representation from a small clinical dataset and obtain accurate predictions, so that patients can receive timely treatment. We evaluate the performance of the framework on a dataset collected from Shanghai Children's Medical Center. Compared with common machine learning methods, our method improves the F1-score by 16.06% on the test set.
Funding: This work was supported in part by the Shanghai Agriculture Applied Technology Development Program, China (Grant No. G 2020-02-08-00-07-F01480), the Shanghai Municipal Science and Technology Commission Innovation Action Plan (Grant No. 17391900900), and the National Natural Science Foundation of China (Grant No. 61573258).
Abstract: y consumption efficiency and to increase the crop yield. With the increase of agricultural data generated by the Internet of Things (IoT), more feasible models are necessary to make full use of such information. In this research, a Gradient Boost Decision Tree (GBDT) model based on the newly developed Light Gradient Boosting Machine algorithm (LightGBM or LGBM) was proposed to model the internal temperature of a greenhouse. Features including climate variables, control variables, and additional temporal information collected within five years were used to construct a suitable dataset to train and validate the LGBM model. An adaptive cross-validation method was developed as a novelty to improve the performance and self-adaptive ability of the LGBM model. For comparison of the predictive accuracy, a Back-Propagation (BP) Neural Network model and a Recurrent Neural Network (RNN) model were built under the same process. Two other GBDT algorithms, Extreme Gradient Boosting (Xgboost) and Stochastic Gradient Boosting (SGB), were also introduced to compare their predictive accuracy with that of the LGBM model. Results suggest that the LGBM has the best fitting ability for the temperature curves, with an RMSE of 0.645 °C, as well as the fastest training speed among all algorithms, training 60 times faster than the two neural network algorithms. The LGBM has strong application prospects in both greenhouse environment prediction and real-time predictive control.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/42/43), and to the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R135), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: The agricultural sector's day-to-day operations, such as irrigation and sowing, are affected by the weather. Weather therefore plays a key role in all regular human activities. Weather forecasting must be accurate and precise so that we can plan our activities and safeguard ourselves and our property from disasters. Rainfall, wind speed, humidity, wind direction, cloud, temperature, and other weather forecasting variables are used in this work for weather prediction. Many research works have been conducted on weather forecasting. The drawbacks of existing approaches are that they are less effective, inaccurate, and time-consuming. To overcome these issues, this paper proposes an enhanced and reliable weather forecasting technique and also develops weather forecasting for remote areas. Weather data analysis and machine learning techniques, such as Gradient Boosting Decision Tree, Random Forest, Naive Bayes Bernoulli, and the KNN algorithm, are deployed to anticipate weather conditions. A comparative analysis of the results aids in determining the number of ensemble methods that may be used to improve the accuracy of prediction in weather forecasting. The aim of this study is to demonstrate the technique's ability to produce weather forecasts as quickly as possible. Experimental evaluation shows that our ensemble technique achieves 95% prediction accuracy. In addition, prediction takes less than 10 s for 1000 nodes and less than 40 s for 5000 nodes.
Funding: Supported by the Educational Commission of Liaoning Province of China (No. LQGD2017027).
Abstract: To address the personalized movie recommendation problem, a recommendation algorithm integrating manifold learning and ensemble learning is studied. In this work, manifold learning is used to reduce the dimension of the data so that both the time and space complexities of the model are reduced. Meanwhile, the gradient boosting decision tree (GBDT) is used to train the target user profile prediction model. Based on the recommendation results, a Bayesian optimization algorithm is applied to optimize the recommendation model, which can effectively improve the prediction accuracy. The experimental results show that the proposed algorithm can improve the accuracy of movie recommendation.
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 72174158 and 72474164) and the Fundamental Research Funds for the Central Universities of China (No. 2042023kf0222)
Abstract: Rapid urbanization has markedly affected urban ecosystem health (EH), making it imperative to explore the relationships between EH and urbanization, as well as to identify the key factors influencing EH. This study addresses 2 key research gaps: (a) The traditional pressure–state–response evaluation framework fails to integrate ecosystem service demands and landscape pattern indices and has not formed a comprehensive EH evaluation system. (b) There is a lack of research on the drivers and thresholds of EH across areas with different spatial relationships between urbanization and EH at the urban scale. Here, taking Wuhan, China, as an example, this study assesses EH using an optimized pressure–state–response evaluation framework. Additionally, bivariate Moran's I is used to analyze the spatial relationship between EH and urbanization. We use gradient boosting decision trees to flexibly model the nonlinear relationships between influencing factors and EH, while Shapley additive explanations quantify each factor's contribution, enhancing model interpretability and clarifying their effects on EH. The findings reveal a spatial distribution pattern characterized by lower EH levels in central areas and higher EH levels in peripheral areas, with a notable negative spatial correlation between EH and urbanization. The spatial heterogeneity and clustering of EH and urbanization across Wuhan exhibit a ringlike pattern radiating from the center to the periphery. Landscape pattern index and land use are identified as key influencing factors of EH in Wuhan, with substantial regional variation, necessitating targeted environmental protection strategies. This study offers insights into urban planning and policymaking, promoting sustainable urban development.
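The bivariate Moran's I used above measures whether one variable at a location is correlated with a second variable at neighbouring locations. The sketch below uses one common formulation (standardized variables, cross-product weighted by a spatial weights matrix and divided by the sum of the weights). The four-cell layout, the EH and urbanization values, and the weights are hypothetical and are not data from the study.

```python
from math import sqrt


def standardize(values):
    """Z-scores using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def bivariate_morans_i(x, y, weights):
    """I_xy = sum_ij w_ij * zx_i * zy_j / sum_ij w_ij, with z-scored x and y."""
    zx, zy = standardize(x), standardize(y)
    n = len(x)
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * zx[i] * zy[j]
                for i in range(n) for j in range(n))
    return cross / s0


# Hypothetical transect of four cells running from city centre to periphery.
eh = [0.2, 0.4, 0.7, 0.9]             # ecosystem health index
urbanization = [0.9, 0.7, 0.3, 0.1]   # urbanization level

# Binary contiguity weights: each cell neighbours the adjacent cells only.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_xy = bivariate_morans_i(eh, urbanization, w)
```

A negative `i_xy` means that cells with high EH tend to sit next to cells with low urbanization, which is the kind of negative spatial correlation the abstract reports. Significance is usually assessed with a permutation test, which is omitted here.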
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42271206), the Guangdong Philosophy and Social Science Foundation (No. GD22XGL08), the Basic and Applied Basic Research Foundation of Guangzhou (No. 2024A04J4541), and the Fundamental Research Funds for the Central Universities (No. 2024ZYGXZR025, 2024ZYGXZR003).
Abstract: The rapid development of digital technologies has driven the emergence and popularization of online-to-offline (O2O) retail, reshaping the retail landscape in urban China. However, the spatial distribution characteristics and influencing mechanisms of emerging O2O retail have not been thoroughly investigated in extant studies. Taking the central urban area of Guangzhou as a case, this study used multi-source data and machine learning methods to explore the distribution characteristics of O2O retail space and to identify the nonlinear effects of built environment, sociodemographic, and economic factors on its distribution. The results revealed that O2O retail space exhibited a 'single-center' distribution pattern, in contrast to the 'multi-center' distribution pattern of traditional retail space. This finding supported the diffusion of innovation hypothesis, highlighting that O2O retail modes first spread from traditional, developed retail space. Furthermore, spatial heterogeneities were observed across different types of O2O retail space: O2O in-store retail showed a 'core-periphery' spatial structure as described by Central Place Theory, whereas O2O delivery displayed a 'horizontal, non-hierarchical, and multi-centered' network structure following Central Flow Theory. Compared with traditional retail space, the distribution of O2O retail space was more influenced by sociodemographic factors such as the proportion of youth, education level, and income level, but less affected by built environment factors such as office and building density. In addition, nonlinear effects of these influencing factors on the distribution of O2O retail space were identified, which enriched the existing literature by highlighting effective ranges and threshold effects. These findings provided valuable insights into O2O retail space development in the context of digital transformation.
Funding: The National Natural Science Foundation of China (Grant Nos. 51878204, 52278057).
Abstract: Pedestrian well-being reflects emotional experience during walking. Analyzing which built environment factors influence pedestrian well-being not only helps to improve residents' physical and mental health but also encourages more walking. Based on data obtained via a questionnaire survey in Harbin, China, a gradient boosting decision tree (GBDT) model is developed to analyze how the perception of the built environment influences pedestrian well-being and to explain the differences across types of neighborhoods (old, new, and mixed). The results show that pedestrian well-being is most influenced by the diversity of daily service facilities, followed by the number of commercial facilities along a street, the accessibility of daily service facilities, and green spaces. Moreover, pedestrian well-being is also influenced by the type of neighborhood. In new neighborhoods, it is dominated by the accessibility of public transport stations, while in old and mixed neighborhoods, it is primarily determined by the accessibility of green spaces and the number of green spaces, respectively. Depending on the characteristics of the built environment, different intervention measures are proposed to improve pedestrian well-being and promote walking.
Funding: Sponsorship from the National Natural Science Foundation of China (No. 52172304), the Social Science Foundation of Hebei Province (No. HB22YJ040), and the Colleges and Universities in Hebei Province Science and Technology Research Project (No. ZD2022123).
Abstract: Transit-oriented development is extensively employed for the development of metro stations. Service facilities are key components of the regional built environment of metro stations, and the arrangement and design of these facilities significantly affect regional economic efficiency, particularly housing prices. Data from Tianjin, China, are analyzed using the gradient boosting decision tree model to explore the impacts of service facilities on housing prices at various operational stages. The results showed that: (1) the development of metro stations encourages the evolution of housing prices and service facilities, and the concentration effect of service facility allocation intensity is more pronounced in the urban central circle; (2) the characteristics of service facilities in metro station areas have nonlinear and threshold effects on housing prices; (3) the gradient boosting decision tree model explains the premium effect better than the ordinary least squares model; (4) kernel density is the most significant factor affecting housing prices, with per capita quantity decreasing as diversity increases over time. The results encourage government departments to improve rail transit links and to optimize public service facilities in station areas.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61972451, 61672334, and 61902230) and the Fundamental Research Funds for the Central Universities (Grant No. GK201901010)
Abstract: With the accumulation of dysregulated circular RNAs (circRNAs) in pathological processes, the regulatory functions of circRNAs, especially circRNAs as microRNA (miRNA) sponges and their interactions with RNA-binding proteins (RBPs), have been widely validated. However, the collected information on experimentally validated circRNA-disease associations is only preliminary. Therefore, an updated CircR2Disease database providing a comprehensive resource and web tool to clarify the relationships between circRNAs and diseases in diverse species is necessary. Here, we present an updated CircR2Disease v2.0 with an increased number of circRNA-disease associations and novel characteristics. CircR2Disease v2.0 provides more than five times as many experimentally validated circRNA-disease associations as its previous version. This version includes 4201 entries between 3077 circRNAs and 312 disease subtypes. Secondly, information on circRNA-miRNA, circRNA-miRNA-target, and circRNA-RBP interactions has been manually collected for various diseases. Thirdly, the gene symbols of circRNAs and disease name IDs can be linked with various nomenclature databases. Detailed descriptions such as samples and journals have also been integrated into the updated version. Thus, CircR2Disease v2.0 can serve as a platform for users to systematically investigate the roles of dysregulated circRNAs in various diseases and further explore the posttranscriptional regulatory function in diseases. Finally, we propose a computational method named circDis, based on the graph convolutional network (GCN) and gradient boosting decision tree (GBDT), to illustrate the applications of the CircR2Disease v2.0 database. CircR2Disease v2.0 is available at http://bioinfo.snnu.edu.cn/CircR2Disease_v2.0 and https://github.com/bioinforlab/CircR2Disease-v2.0.
Funding: The research work is financially supported by the National Natural Science Foundation of China (No. 51775113), the Natural Science Foundation of Fujian Province (No. 2017J01675), the 51st Scientific Research Fund Program of Fujian University of Technology (No. GY-Z160139), the Key Research Platform of NC Equipment and Technology in Fujian Province (No. 2014H2002), and the Subsidized Project for Postgraduates' Innovative Fund in Scientific Research of Huaqiao University (No. 17013080007).
Abstract: As a typical screening apparatus, the elliptically vibrating screen is extensively employed for the size classification of granular materials. Unremitting efforts have been devoted to improving sieving performance, but the optimization problem still perplexes researchers because of the complexity of the sieving process. In the present paper, the sieving process of an elliptically vibrating screen was numerically simulated based on the Discrete Element Method (DEM). The production quality and the processing capacity of the vibrating screen were measured by the screening efficiency and the screening time, respectively. The sieving parameters investigated were the length of the semi-major axis, the length ratio of the two semi-axes, the vibration frequency, the inclination angle, the vibration direction angle, and the motion direction of the screen deck. First, the Gradient Boosting Decision Trees (GBDT) algorithm was adopted to model the screening data. Trained prediction models with sufficient generalization performance were obtained, and the relative importance of the six parameters for both screening indexes was revealed. After that, a hybrid MACO-GBDT algorithm based on Ant Colony Optimization (ACO) was proposed for optimizing the sieving performance of the vibrating screen. Both the single-objective optimization of screening efficiency and the stepwise optimization of screening results were conducted. Finally, the reliability of the MACO-GBDT algorithm was examined by numerical experiments. The optimization strategy provided in this work would be helpful for the parameter design and performance improvement of vibrating screens.