Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme Gradient Boosting (XGBoost), are often plagued by high computational costs, which makes real-time detection challenging. In this regard, we propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a random forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images. The random forest model then performs the prediction using the features extracted by VGG16. Additionally, ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative study highlights the success of our strategy, which improves performance while maintaining a lower computational cost. This method is both effective and well suited to real-time applications.
The random forest model is a mainstream research method for accurately describing the distribution law and impact mechanism of regional population. We took Shijiazhuang as the research area, with comprehensive zoning based on endowments as the modeling unit, conducted stratified sampling on a hectare grid cell, and systematically carried out incremental selection experiments on population density impact factors, optimizing the population density random forest model throughout the process (zonal modeling, stratified sampling, factor selection, weighted output). The results are as follows: (1) Zonal modeling addresses the confusion in population distribution laws caused by a single model. Sampling on a grid cell not only ensures the quality of training data by avoiding the modifiable areal unit problem (MAUP) but also attempts to mitigate the adverse effects of the ecological fallacy. Stratified sampling ensures the stability of population density label values (the target variable) in the training sample. (2) Zonal selection experiments on population density impact factors help identify suitable combinations of factors, leading to a significant improvement in the goodness of fit (R²) of the zonal models. (3) Weighted combination output of the population density prediction dataset substantially enhances the model's robustness. (4) The population density dataset exhibits multi-scale superposition characteristics. On a large scale, population density in plains is higher than in mountainous areas, while on a small scale, urban areas have higher density than rural areas. The optimization scheme that we propose for the population density random forest model offers a unified technical framework for uncovering local population distribution laws and impact mechanisms.
To improve the efficiency of air quality analysis and the accuracy of predictions, this paper proposes a composite method based on Vector Autoregressive (VAR) and Random Forest (RF) models. In the theoretical section, the model introduction and estimation algorithms are provided. In the empirical analysis section, global air quality data from 2022 to 2024 are used, and the proposed method is applied. Specifically, principal component analysis (PCA) is first conducted, and then VAR and Random Forest methods are used for prediction on the reduced-dimensional data. The results show that the RMSE of the hybrid model is 45.27, significantly lower than the 49.11 of the VAR model alone, verifying its superiority. The stability and predictive performance of the model are effectively enhanced.
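The RMSE comparison above can be made concrete with a short calculation. The following is a minimal sketch, not the authors' code: the function names `rmse` and `relative_reduction` are chosen here for illustration, and only the two reported RMSE values (49.11 and 45.27) come from the abstract.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Reported values from the abstract: VAR alone vs. the VAR + RF hybrid.
reduction = relative_reduction(49.11, 45.27)
print(f"{reduction:.1%}")  # prints 7.8%
```

Read this way, the hybrid model lowers the RMSE by roughly 7.8% relative to VAR alone; whether that difference is statistically significant cannot be judged from the two summary values.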
The potential of the Random Forest Model for mapping different desertification processes was studied in the Muttuma watershed of the mid-Murrumbidgee river region of New South Wales, Australia. A desertification vulnerability index was developed using climate, terrain, vegetation, soil and land quality indices to identify environmentally sensitive areas for desertification. The Random Forest Model (RFM) was used to predict different desertification processes, such as soil erosion, salinization and waterlogging, in the watershed, and the information needed to train the classification algorithms was obtained from satellite imagery interpretation and ground truth data. Climatic factors (evaporation, rainfall, temperature), terrain factors (aspect, slope, slope length, steepness, and wetness index), soil properties (pH, organic carbon, clay and sand content) and vulnerability indices were used as explanatory variables. Classification accuracy and kappa index were calculated for the training and testing datasets. We recorded an overall accuracy rate of 87.7% and 72.1% for training and testing sites, respectively. We found larger discrepancies between the overall accuracy rate and the kappa index for the testing datasets (72.2% and 27.5%, respectively), suggesting that not all classes are predicted well. Prediction was good for the soil erosion and no-desertification classes but poor for the salinization and waterlogging classes. Overall, the results suggest a new way of using knowledge of desertification processes in training areas to predict desertification processes in unvisited areas.
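The gap between overall accuracy and the kappa index reported above is typical when one class dominates. The following sketch computes both from a confusion matrix; the counts are hypothetical and are not taken from the study, and the function name `accuracy_and_kappa` is chosen here for illustration.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, given the class frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: one dominant class and one rare class.
confusion = [
    [85, 5],  # true class 0
    [8, 2],   # true class 1
]
acc, kappa = accuracy_and_kappa(confusion)
print(f"accuracy={acc:.2f} kappa={kappa:.3f}")  # accuracy=0.87 kappa=0.167
```

Here the classifier is right 87% of the time, yet kappa is only about 0.17 because the rare class is mostly missed, which is the same pattern the abstract describes for salinization and waterlogging.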
Objective Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assays have been applied to the identification of human body fluids and have exhibited excellent performance in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented a specific methylation profile across the 10 markers. Results Significant differences in methylation levels were observed between the mixtures and the single body fluids. For all kinds of mixtures, Spearman's correlation analysis revealed a significantly strong correlation between the methylation levels and the component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained, one for the prediction of mixture types and one for the prediction of the mixture proportion of 2 components, based on the methylation levels of the 10 markers. For mixture prediction, Model-1 presented outstanding prediction accuracy, which reached 99.3% in 427 training samples and 100% in 243 independent test samples. For mixture proportion prediction, Model-2 demonstrated an excellent accuracy of 98.8% in 252 training samples and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion These results indicate the excellent capability and value of the multiplex methylation system in the identification of forensic body fluid mixtures.
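The Spearman analysis mentioned in the Results can be sketched as follows. This is an illustrative implementation, not the study's code: the methylation levels are hypothetical values invented for the example, and only the seven mixing ratios come from the abstract.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / (sxx * syy) ** 0.5


# Fraction of the first component for ratios 1:20, 1:10, 1:5, 1:1, 5:1, 10:1, 20:1.
proportions = [1 / 21, 1 / 11, 1 / 6, 1 / 2, 5 / 6, 10 / 11, 20 / 21]
# Hypothetical methylation levels of one marker (illustration only).
levels = [0.08, 0.12, 0.20, 0.45, 0.71, 0.80, 0.86]
rho = spearman(proportions, levels)
```

Because the hypothetical levels rise monotonically with the proportion, `rho` is 1.0 here; real marker data would generally give a value below that.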
This study investigates the application of machine learning models to address after-sales service issues in cross-border e-commerce, focusing on predicting order returns to reduce return costs and optimize customer experience. Using H cross-border e-commerce company as a case study, the research employs Random Forest and XGBoost models to identify high-risk return orders. By comparing the performance of these two models, the study highlights their respective strengths and weaknesses and proposes optimization strategies. The findings provide a valuable reference for e-commerce companies to refine their business models, reduce return rates, improve operational efficiency, and enhance customer satisfaction.
The Tarim River Basin (TRB) is a vast area with plenty of light and heat and is an important base for grain and cotton production in Northwest China. In the context of climate change, however, the increased frequency of extreme weather and climate events is having numerous negative impacts on the region's agricultural production. To better understand how unfavorable climatic conditions affect crop production, we explored the relationship of extreme weather and climate events with crop yields and phenology. In this research, ten indicators of extreme weather and climate events (consecutive dry days (CDD), min Tmax (TXn), max Tmin (TNx), tropical nights (TR), warm days (Tx90p), warm nights (Tn90p), summer days (SU), frost days (FD), very wet days (R95p), and windy days (WD)) were selected to analyze the impact of spatial and temporal variations on the yields of major crops (wheat, maize, and cotton) in the TRB from 1990 to 2020. The three key findings of this research were as follows: extreme temperatures in southwestern TRB showed an increasing trend, with higher extreme temperatures at night, while the occurrence of extreme weather and climate events in northeastern TRB was relatively low. The number of FD was on the rise, while WD also increased in recent years. Crop yields were higher in the northeast than in the southwest, and wheat, maize, and cotton yields generally showed an increasing trend despite an earlier decline. The correlations of extreme weather and climate events with crop yields can be grouped into extreme nighttime temperature indices (TNx, Tn90p, TR, and FD), extreme daytime temperature indices (TXn, Tx90p, and SU), extreme precipitation indices (CDD and R95p), and extreme wind (WD). By using a Random Forest (RF) approach to determine the effects of different extreme weather and climate events on the yields of different crops, we found that the importance of the extreme precipitation indices (CDD and R95p) to crop yield decreased significantly over time. In addition, we found that the importance of the extreme nighttime temperature indices (TR and TNx) for the yields of the three crops increased during 2005-2020 compared with 1990-2005. The impact of extreme temperature events on wheat, maize, and cotton yields in the TRB is becoming increasingly significant, and this finding can inform policy decisions and agronomic innovations to better cope with current and future climate warming.
Understanding how environmental adaptation varies among families within a species is critical to adapt forestry activities such as management and breeding to possible future climate change. The present study examined home-site advantage and local advantage in growth and basic density of wood in 36 families of Chamaecyparis obtusa (Siebold et Zucc.) Endl., reciprocally planted at two progeny test sites with differing climatic conditions in Japan. A significant home-site advantage for growth was detected between the lowland and mountainous regions within the Kanto breeding region. In addition, the effects of climate differentials between the selection site of mating parents and the progeny test site on growth and basic density were investigated. As a result, temperature was identified as the most significant climatic factor contributing to local adaptation for growth traits. Elongation and radial growth were adversely influenced when the progeny test site temperature exceeded the provenance temperature by more than 2°C. Therefore, it is crucial to account for temperature differences between the provenance and the planting site to adapt afforestation and forest tree breeding to climate change in the future.
Fires are one of the most destructive natural disasters and have serious long-term effects on the environment, economy, and human health. In Inner Mongolia Autonomous Region, China, frequent fire disturbance occurs due to the intensification of climate change and human activities. It is crucial to understand the fire regime and estimate the probability of regional fire occurrence in order to reduce fire losses. However, most studies have primarily focused on the dynamic changes, probability of occurrence, and driving mechanisms of wildfires in the grassland and forest land ecosystems in Inner Mongolia, while insufficient research has been conducted on the spatiotemporal variations in active fires and their impact on the wildfire risk in forest land and grassland. Therefore, in this study, we analyzed the active fire regime based on Moderate Resolution Imaging Spectroradiometer (MODIS) thermal anomalies and burned area products from 2000 to 2022. Combined with climate, topographic, landscape, anthropogenic, and vegetation datasets, logistic regression (LR), support vector machine (SVM), random forest (RF), and convolutional neural network (CNN) models were chosen to estimate the probability of active fire occurrence at the seasonal timescale. The results revealed that: (1) a total of 100,343 active fires occurred in Inner Mongolia and the burned area reached 6.59×10^4 km². The number of ignition points exhibited a significant increasing trend, while the burned area exhibited a nonsignificant decreasing trend; (2) four active fire belts were detected, namely, the Hetao-Tumochuan Plain fire belt, Xiliao River Plain fire belt, Songnen Plain fire belt, and Hailar River Eroded Plain fire belt. The centroid of the active fires has shifted 456.4 km toward the southwest; (3) the RF model achieved the highest accuracy in estimating the probability of active fire occurrence, followed by CNN, while the LR and SVM models had lower accuracies; and (4) the distribution of the high and extremely high fire risk areas largely aligned with the four fire belts. The probability of active fire occurrence was the highest in spring, followed by that in autumn, and it gradually decreased in summer and winter. Our results revealed that active fires migrated to the southwest and ignition sources increased, although the reduction in burned area was not significant. The RF model outperformed the other models in predicting the probability of active fire occurrence. These findings contribute to future fire prevention and prediction in Inner Mongolia.
Recently, the outbreak and spread of larch caterpillar (Dendrolimus superans) pests have emerged as significant contributors to forest degradation in the Changbai Mountains, China. Understanding the spatiotemporal distribution patterns of these pests is crucial for effective management and protection of forest ecosystems. This study proposes a pest monitoring approach based on Sentinel imagery. Through time-series analysis, we extracted pest-sensitive features and developed a random forest classifier that integrated Sentinel-1, Sentinel-2, and field sampling data from 2019–2023 to monitor larch caterpillar pests in the Changbai Mountains National Nature Reserve (CMNNR), Northeast China. Our findings indicated that the green (B3), near-infrared (B8), and short-wave infrared (B11 and B12) bands of Sentinel-2 remote sensing images exhibited notable discriminative capabilities for identifying larch caterpillar pests. Specifically, the Normalized Difference Vegetation Index (NDVI) at the end of the growing season emerged as the most valuable feature for pest extraction. Incorporating Synthetic Aperture Radar (SAR) features along with optical data marginally enhanced model performance. Furthermore, our approach revealed the outbreak of larch caterpillar pests, producing classification maps with overall accuracy exceeding 85% and Kappa coefficient surpassing 0.8 for the five study years. The pest outbreak began in 2019 and progressively intensified over time. In September 2019, the affected area spanned 114.23 km². The infested area exhibited a declining trend from 2020 to 2023. This study introduces a novel method for the high-precision identification of larch caterpillar pests, offering technical advancements and theoretical underpinnings to support forest management strategies.
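NDVI, identified above as the most valuable feature, is the normalized difference of near-infrared and red reflectance. The sketch below assumes the usual Sentinel-2 pairing of B8 (near-infrared) with B4 (red); B4 is not named in the abstract, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are surface reflectances (assumed here to be Sentinel-2
    bands B8 and B4). Returns 0.0 when both bands are zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical end-of-season reflectances for a healthy and a defoliated pixel.
healthy = ndvi(nir=0.45, red=0.05)
defoliated = ndvi(nir=0.25, red=0.15)
print(round(healthy, 2), round(defoliated, 2))  # prints 0.8 0.25
```

Defoliation lowers near-infrared reflectance and raises red reflectance, so an infested stand tends to show a lower NDVI than a healthy one, which is the kind of contrast a classifier can exploit.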
COVID-19, a source of widespread fear and anxiety, is one of the most recent and emergent of various respiratory disorders. It is similar to MERS-CoV and SARS-CoV, the viruses that affected large populations in different countries in 2012 and 2002, respectively. Various standard models have been used for COVID-19 epidemic prediction, but they suffered from low accuracy due to limited data availability and a high level of uncertainty. The proposed approach used a machine learning-based time-series Facebook NeuralProphet model to predict the number of deaths as well as confirmed cases and compared it with the Poisson Distribution and Random Forest models. The analysis of the dataset was performed for the period from January 1st 2020 to 16th July 2021. The model was developed to obtain forecast values until September 2021. This study aimed to predict the course of the COVID-19 pandemic during the second wave of coronavirus in India using the latest time-series model to observe and predict the pandemic situation across the country. In India, cases have been increasing rapidly day by day since mid-February 2021. The proposed model shows a good ability to forecast the death rate on the COVID-19 dataset, especially in the second wave. The proposed model works effectively and can support future validation of such predictions.
BACKGROUND Gestational diabetes mellitus (GDM) is a condition characterized by high blood sugar levels during pregnancy. The prevalence of GDM is on the rise globally, and this trend is particularly evident in China, where it has emerged as a significant issue impacting the well-being of expectant mothers and their fetuses. Identifying and addressing GDM in a timely manner is crucial for maintaining the health of both expectant mothers and their developing fetuses. Therefore, this study aims to establish a risk prediction model for GDM and explore the effects of serum ferritin, blood glucose, and body mass index (BMI) on the occurrence of GDM. AIM To develop a risk prediction model to analyze factors leading to GDM, and evaluate its efficiency for early prevention. METHODS The clinical data of 406 pregnant women who underwent routine prenatal examination in Fujian Maternity and Child Health Hospital from April 2020 to December 2022 were retrospectively analyzed. According to whether GDM occurred, they were divided into two groups to analyze the related factors affecting GDM. Then, according to the weight of the relevant risk factors, the training set and the verification set were divided at a ratio of 7:3. Subsequently, a risk prediction model was established using logistic regression and random forest models, and the model was evaluated and verified. RESULTS Pre-pregnancy BMI, previous history of GDM or macrosomia, hypertension, hemoglobin (Hb) level, triglyceride level, family history of diabetes, serum ferritin, and fasting blood glucose levels during early pregnancy were determined. These factors were found to have a significant impact on the development of GDM (P<0.05). For the nomogram model's prediction of GDM in pregnancy, the area under the curve (AUC) was 0.883 [95% confidence interval (CI): 0.846-0.921], and the sensitivity and specificity were 74.1% and 87.6%, respectively. The top five variables in the random forest model for predicting the occurrence of GDM were serum ferritin, fasting blood glucose in early pregnancy, pre-pregnancy BMI, Hb level and triglyceride level. The random forest model achieved an AUC of 0.950 (95% CI: 0.927-0.973), with a sensitivity of 84.8% and a specificity of 91.4%. The Delong test showed that the AUC value of the random forest model was higher than that of the decision tree model (P<0.05). CONCLUSION The random forest model is superior to the nomogram model in predicting the risk of GDM. This method is helpful for early diagnosis and appropriate intervention of GDM.
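The sensitivity and specificity figures above can be combined into a single number for a side-by-side comparison. The sketch below uses Youden's J (sensitivity + specificity - 1), a summary that the abstract itself does not report; the function names and the small count example are chosen here for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Youden's J statistic; 0 means chance level, 1 means perfect separation."""
    return sensitivity + specificity - 1


# Values reported in the abstract for the two models.
nomogram_j = youden_index(0.741, 0.876)
forest_j = youden_index(0.848, 0.914)
print(f"nomogram J={nomogram_j:.3f}, random forest J={forest_j:.3f}")
# prints: nomogram J=0.617, random forest J=0.762
```

On this summary the random forest model also comes out ahead, in line with the AUC comparison, although J depends on the classification threshold each model used.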
BACKGROUND Type 2 diabetes mellitus (T2DM) is associated with periodontitis. Currently, there are few studies proposing predictive models for periodontitis in patients with T2DM. AIM To determine the factors influencing periodontitis in patients with T2DM by constructing logistic regression and random forest models. METHODS In this retrospective study, 300 patients with T2DM who were hospitalized at the First People's Hospital of Wenling from January 2022 to June 2022 were selected for inclusion, and their data were collected from hospital records. We used logistic regression to analyze factors associated with periodontitis in patients with T2DM, and random forest and logistic regression prediction models were established. The prediction efficiency of the models was compared using the area under the receiver operating characteristic curve (AUC). RESULTS Of 300 patients with T2DM, 224 had periodontitis, with an incidence of 74.67%. Logistic regression analysis showed that age [odds ratio (OR) = 1.047, 95% confidence interval (CI): 1.017-1.078], teeth brushing frequency (OR = 4.303, 95% CI: 2.154-8.599), education level (OR = 0.528, 95% CI: 0.348-0.800), glycosylated hemoglobin (HbA1c) (OR = 2.545, 95% CI: 1.770-3.661), total cholesterol (TC) (OR = 2.872, 95% CI: 1.725-4.781), and triglyceride (TG) (OR = 3.306, 95% CI: 1.019-10.723) influenced the occurrence of periodontitis (P < 0.05). The random forest model showed that the most influential variable was HbA1c, followed by age, TC, TG, education level, brushing frequency, and sex. Comparison of the prediction effects of the two models showed that in the training dataset, the AUC of the random forest model was higher than that of the logistic regression model (AUC = 1.000 vs AUC = 0.851; P < 0.05). In the validation dataset, there was no significant difference in AUC between the random forest and logistic regression models (AUC = 0.946 vs AUC = 0.915; P > 0.05). CONCLUSION Both random forest and logistic regression models have good predictive value and can accurately predict the risk of periodontitis in patients with T2DM.
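The AUC values compared above have a simple interpretation: the probability that a randomly chosen patient with periodontitis receives a higher predicted score than a randomly chosen patient without it. The sketch below computes AUC directly from that definition; the labels and scores are hypothetical and the function name `roc_auc` is chosen here for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions: 1 = periodontitis, 0 = no periodontitis.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # prints 0.889
```

An AUC of exactly 1.000 on training data, as reported for the random forest model, means every positive was scored above every negative in that set; the validation AUC is the more informative figure for how the model generalizes.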
Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of these methods use only the time-domain information of traffic flow data, ignoring the impact of spatial correlation on the predicted flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model based on long short-term memory and random forest (LSTM-RF) is proposed. In the process of traffic flow prediction, the long short-term memory (LSTM) model is used to extract the time sequence features of the target road segment. Then, the predicted value of the LSTM and the collected information of adjacent upstream and downstream sections are used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction results. The model was tested and verified on traffic flow data from 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method achieves better prediction accuracy than a single model, with a clearly reduced prediction error.
The dead fuel moisture content (DFMC) is the key driver leading to fire occurrence. Accurately estimating the DFMC could help identify locations facing fire risks, prioritise areas for fire monitoring, and facilitate timely deployment of fire-suppression resources. In this study, the DFMC and environmental variables, including air temperature, relative humidity, wind speed, solar radiation, rainfall, atmospheric pressure, soil temperature, and soil humidity, were simultaneously measured in a grassland of Ergun City, Inner Mongolia Autonomous Region of China in 2021. We chose three regression models, i.e., random forest (RF) model, extreme gradient boosting (XGB) model, and boosted regression tree (BRT) model, to model the seasonal DFMC according to the data collected. To ensure accuracy, we added time-lag variables of 3 d to the models. The results showed that, among the three models, the RF model had the best fitting effect, with an R² value of 0.847, and the best prediction accuracy, with a mean absolute error of 4.764%. The accuracies of the models in spring and autumn were higher than those in the other two seasons. In addition, different seasons had different key influencing factors, and the degree of influence of these factors on the DFMC changed with time lags. Moreover, time-lag variables within 44 h clearly improved the fitting effect and prediction accuracy, indicating that environmental conditions within approximately 48 h greatly influence the DFMC. This study highlights the importance of considering 48 h time-lagged variables when predicting the DFMC of grassland fuels and mapping grassland fire risks based on the DFMC to help locate high-priority areas for grassland fire monitoring and prevention.
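The time-lag variables described above amount to pairing each DFMC observation with earlier readings of an environmental variable. A minimal sketch of that feature construction follows; the function name `lagged_rows`, the choice of relative humidity as the predictor, the lag steps, and all numbers are assumptions made for illustration and are not taken from the study.

```python
def lagged_rows(predictor, target, lags):
    """Pair each target value with predictor values from earlier time steps.

    Returns a list of (features, target_value) tuples, where features[i] is
    the predictor value lags[i] steps before the target observation.
    """
    if len(predictor) != len(target):
        raise ValueError("predictor and target must have the same length")
    start = max(lags)  # earliest step for which every lag is available
    return [
        ([predictor[t - lag] for lag in lags], target[t])
        for t in range(start, len(target))
    ]


# Hypothetical equally spaced observations (illustration only).
humidity = [40, 42, 45, 50, 55, 53]
dfmc = [10, 11, 12, 14, 16, 15]
rows = lagged_rows(humidity, dfmc, lags=[0, 1, 2])
print(rows[0])  # prints ([45, 42, 40], 12)
```

Each row can then be fed to a regression model; the first `max(lags)` observations are dropped because their lagged readings do not exist.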
In a recent paper, Hong et al developed an artificial intelligence (AI)-driven predictive scoring system for potential complications following laparoscopic radical gastrectomy for gastric cancer patients. They demonstrated that integrating AI with random forest models significantly improved the accuracy of preoperative prediction and patient outcome management. By incorporating data from multiple centers, their model ensures standardization, reliability, and broad applicability, distinguishing it from prior models. The present study highlights AI's potential in clinical decision support, aiding in the preoperative and postoperative management of gastric cancer patients. Our findings may pave the way for future prospective studies to further enhance AI-supported diagnoses in clinical practice.
The critical zone (CZ) plays a vital role in sustaining biodiversity and humanity. However, flux quantification within the CZ, particularly in terms of subsurface hydrological partitioning, remains a significant challenge. This study focused on quantifying subsurface hydrological partitioning, specifically in an alpine mountainous area, and highlighted the important role of lateral flow during this process. Precipitation entering the soil was most commonly partitioned into two parts: an increase in soil water content (SWC) and lateral flow out of the soil pit. It was found that 65%–88% of precipitation contributed to lateral flow. The second most common partitioning class showed an increase in SWC caused by both precipitation and lateral flow into the soil pit. In this case, lateral flow contributed 43% to 74% of the SWC increase, which was notably larger than the SWC increase caused by precipitation. On alpine meadows, lateral flow from the soil pit occurred when the shallow soil was wetter than the field capacity. This result highlighted the need for three-dimensional simulation between soil layers in Earth system models (ESMs). During the evapotranspiration (ET) process, significant differences were observed in the classification of subsurface hydrological partitioning among different vegetation types. Due to tangled and aggregated fine roots in the surface soil on alpine meadows, the majority of subsurface responses involved lateral flow, which provided 98%–100% of ET. On grassland, there was a high probability (0.87) that ET was entirely provided by lateral flow. The main reason for underestimating transpiration through soil water dynamics in previous research was the neglect of lateral root water uptake. Furthermore, there was a probability of 0.12 that ET was entirely provided by an SWC decrease on grassland. In this case, there was a high probability (0.98) that soil water responses only occurred at layer 2 (10–20 cm), because grass roots were mainly distributed in this soil layer, and grasses often used their deep roots for water uptake during ET. To improve the estimation of soil water dynamics and ET, we established a random forest (RF) model to simulate lateral flow and then corrected the Community Land Model (CLM). The RF model demonstrated good performance and led to significant improvements in the CLM simulation. These findings enhance our understanding of subsurface hydrological partitioning and emphasize the importance of considering lateral flow in ESMs and hydrological research.
Survival rates following radical surgery for gastric neuroendocrine neoplasms (g-NENs) are low, with high recurrence rates. This fact impacts patient prognosis and complicates postoperative management. Traditional prognostic models, including the Cox proportional hazards (CoxPH) model, have shown limited predictive power for postoperative survival in gastrointestinal neuroectodermal tumor patients. Machine learning methods offer a unique opportunity to analyze complex relationships within datasets, providing tools and methodologies to assess large volumes of high-dimensional, multimodal data generated by biological sciences. These methods show promise in predicting outcomes across various medical disciplines. In the context of g-NENs, utilizing machine learning to predict survival outcomes holds potential for personalized postoperative management strategies. This editorial reviews a study exploring the advantages and effectiveness of the random survival forest (RSF) model, using the lymph node ratio (LNR), in predicting disease-specific survival (DSS) in postoperative g-NEN patients stratified into low-risk and high-risk groups. The findings demonstrate that the RSF model, incorporating LNR, outperformed the CoxPH model in predicting DSS and constitutes an important step towards precision medicine.
In this study, multi-source remote sensing data and machine learning algorithms were used to delineate the prospect area of remote sensing geological prospecting in eastern Botswana. Landsat 8 remote sensing images were used to produce iron stain and hydroxyl anomaly maps, ASTER remote sensing images were used to extract chalcopyrite mineral distribution maps, and Microsoft high-resolution remote sensing data were used to extract lithology and structure maps to comprehensively analyze regional metallogenic information. Then, the random forest, classification and regression tree (CART), and gradient boosting tree (GBT) classification algorithms were compared. The results showed that the random forest algorithm had the best performance in identifying mineralization potential areas, and its accuracy reached 0.95. Finally, the remote sensing geological prospect area of eastern Botswana was delineated based on the random forest algorithm, which provided important technical support for mineral resource exploration in this area. This study shows that the combination of multi-source remote sensing data and an efficient classification algorithm has great potential in geological prospecting, and provides scientific methods and technical means for follow-up remote sensing prospecting research.
Climate change influences both ecosystems and ecosystem services. The impacts of climate change on ecosystems and ecosystem services have been separately documented. However, it is less well known how ecosystem changes driven by climate change will influence ecosystem services, especially in climate-sensitive regions. Here, we analyzed future climate trends between 2040 and 2100 under four Shared Socioeconomic Pathway (SSP) scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) from the Coupled Model Intercomparison Project 6 (CMIP6). We quantified their impacts on ecosystem patterns and on the ecosystem service of sandstorm prevention on the Qinghai-Tibet Plateau (QTP), one of the most climate-sensitive regions in the world, using a Random Forest (RF) model and the Revised Wind Erosion Equation (RWEQ). Strong warming (0.04℃/yr) and wetting (0.65 mm/yr) trends were projected from 2015 to 2100. Under these trends, there will be increased interspersion in the pattern of grassland and sparse vegetation with meadow and swamp vegetation, although their overall area will remain similar, while the areas of shrub and needle-leaved forest classes will increase and move toward higher altitudes. Driven indirectly by the changes in ecosystem patterns caused by climate change, grassland will play an irreplaceable role in providing sandstorm prevention services, and sandstorm prevention services will increase gradually from 2040 to 2100 (1.059-1.070 billion tons) on the QTP. However, some areas show a risk of deterioration in the future and these should be the focus of ecological rehabilitation. Our research helps to understand the cascading relationship among climate change, ecosystem patterns and ecosystem services, which provides important spatio-temporal information for future ecosystem service management.
Funding: Funded by Institutional Fund Projects under grant no. IFPDP-261-22.
Abstract: Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), are often plagued by high computational costs, which makes real-time detection challenging. In this regard, we propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model makes predictions from the extracted features. Additionally, ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative analysis highlights our strategy's success, which improves performance while maintaining a lower computational cost. This makes the method well suited to real-time applications without sacrificing effectiveness.
Funding: National Natural Science Foundation of China, No. 42071167, No. 42201197, No. 40871073; The Second Tibetan Plateau Scientific Expedition and Research Program, No. 2019QZKK0406; Natural Science Foundation of Hebei Province, No. D2007000272.
Abstract: The random forest model is a mainstream research method used to accurately describe the distribution law and impact mechanism of regional population. We took Shijiazhuang as the research area, with comprehensive zoning based on endowments as the modeling unit, conducted stratified sampling on a hectare grid cell, and systematically carried out incremental selection experiments of population density impact factors, optimizing the population density random forest model throughout the process (zonal modeling, stratified sampling, factor selection, weighted output). The results are as follows: (1) Zonal modeling addresses the issue of confusion in population distribution laws caused by a single model. Sampling on a grid cell not only ensures the quality of training data by avoiding the modifiable areal unit problem (MAUP) but also attempts to mitigate the adverse effects of the ecological fallacy. Stratified sampling ensures the stability of population density label values (target variable) in the training sample. (2) Zonal selection experiments on population density impact factors help identify suitable combinations of factors, leading to a significant improvement in the goodness of fit (R²) of the zonal models. (3) Weighted combination output of the population density prediction dataset substantially enhances the model's robustness. (4) The population density dataset exhibits multi-scale superposition characteristics. On a large scale, the population density in plains is higher than that in mountainous areas, while on a small scale, urban areas have higher density compared to rural areas. The optimization scheme for the population density random forest model that we propose offers a unified technical framework for uncovering local population distribution law and the impact mechanisms.
Abstract: To improve the efficiency of air quality analysis and the accuracy of predictions, this paper proposes a composite method based on Vector Autoregressive (VAR) and Random Forest (RF) models. In the theoretical section, the model introduction and estimation algorithms are provided. In the empirical analysis section, global air quality data from 2022 to 2024 are used, and the proposed method is applied. Specifically, principal component analysis (PCA) is first conducted, and then VAR and Random Forest methods are used for prediction on the reduced-dimensional data. The results show that the RMSE of the hybrid model is 45.27, lower than the 49.11 of the VAR model alone, verifying its superiority. The stability and predictive performance of the model are effectively enhanced.
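The size of the RMSE improvement quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the helper names `rmse` and `relative_reduction` are assumptions introduced here, and the only figures taken from the abstract are the two RMSE values (49.11 for VAR alone, 45.27 for the hybrid).

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# RMSE values quoted in the abstract: VAR alone vs. the VAR + RF hybrid.
reduction = relative_reduction(49.11, 45.27)
print(f"hybrid RMSE is {reduction:.1%} lower than VAR alone")
```

On the quoted figures the hybrid's RMSE is roughly 7.8% lower than that of VAR alone; whether that difference is statistically significant cannot be judged from the abstract.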
Abstract: The potential of the Random Forest Model for mapping different desertification processes was studied in the Muttuma watershed of the mid-Murrumbidgee river region of New South Wales, Australia. A desertification vulnerability index was developed using climate, terrain, vegetation, soil and land quality indices to identify environmentally sensitive areas for desertification. The Random Forest Model (RFM) was used to predict the different desertification processes, such as soil erosion, salinization and waterlogging, in the watershed, and the information needed to train classification algorithms was obtained from satellite imagery interpretation and ground truth data. Climatic factors (evaporation, rainfall, temperature), terrain factors (aspect, slope, slope length, steepness, and wetness index), soil properties (pH, organic carbon, clay and sand content) and vulnerability indices were used as explanatory variables. Classification accuracy and kappa index were calculated for training and testing datasets. We recorded an overall accuracy rate of 87.7% and 72.1% for training and testing sites, respectively. We found larger discrepancies between overall accuracy rate and kappa index for testing datasets (72.2% and 27.5%, respectively), suggesting that not all classes are predicted well. Prediction was good for the soil erosion and no-desertification classes but poor for the salinization and waterlogging classes. Overall, the results suggest that knowledge of desertification processes in training areas can be used to predict the desertification processes in unvisited areas.
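The gap between overall accuracy and kappa noted in this abstract is typical when a few classes dominate the sample. A minimal sketch, assuming the standard definition of Cohen's kappa, shows how the two are computed from a confusion matrix. The function name and the toy matrix are hypothetical and are not data from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as
    class j. Kappa is undefined when chance agreement equals 1; that case
    is not handled here.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    # Agreement expected by chance from the marginal class frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical imbalanced two-class example (not data from the study):
# 90 majority-class samples and only 10 minority-class samples.
toy = [[85, 5],
       [8, 2]]
acc, kappa = accuracy_and_kappa(toy)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

In the toy matrix most samples are classified correctly, yet kappa stays low because nearly all of that agreement would be expected by chance from the class frequencies alone.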
Funding: Supported by grants from the Natural Science Foundation of Hubei Province (No. 2020CFB780) and the Fundamental Research Funds for the Central Universities (No. 2017KFYXJJ020).
Abstract: Objective: Body fluid mixtures are complex biological samples that frequently occur at crime scenes and can provide important clues for criminal case analysis. DNA methylation assays have been applied in the identification of human body fluids and have exhibited excellent performance in predicting single-source body fluids. The present study aims to develop a methylation SNaPshot multiplex system for body fluid identification and to accurately predict mixture samples. In addition, the value of DNA methylation in the prediction of body fluid mixtures was further explored. Methods: In the present study, 420 samples of body fluid mixtures and 250 samples of single body fluids were tested using an optimized multiplex methylation system. Each kind of body fluid sample presented specific methylation profiles of the 10 markers. Results: Significant differences in methylation levels were observed between the mixtures and single body fluids. For all kinds of mixtures, Spearman's correlation analysis revealed a significantly strong correlation between the methylation levels and component proportions (1:20, 1:10, 1:5, 1:1, 5:1, 10:1 and 20:1). Two random forest classification models were trained, based on the methylation levels of the 10 markers, to predict the mixture type and the mixture proportion of 2 components. For mixture prediction, Model-1 presented outstanding prediction accuracy, which reached 99.3% in 427 training samples and 100% in 243 independent test samples. For mixture proportion prediction, Model-2 demonstrated an excellent accuracy of 98.8% in 252 training samples and 98.2% in 168 independent test samples. The total prediction accuracy reached 99.3% for body fluid mixtures and 98.6% for the mixture proportions. Conclusion: These results indicate the excellent capability and powerful value of the multiplex methylation system in the identification of forensic body fluid mixtures.
Abstract: This study investigates the application of machine learning models to address after-sales service issues in cross-border e-commerce, focusing on predicting order returns to reduce return costs and optimize customer experience. Using cross-border e-commerce company H as a case study, the research employs Random Forest and XGBoost models to identify high-risk return orders. By comparing the performance of these two models, the study highlights their respective strengths and weaknesses and proposes optimization strategies. The findings provide a valuable reference for e-commerce companies to refine their business models, reduce return rates, improve operational efficiency, and enhance customer satisfaction.
Funding: Funded by the Tianshan Yingcai Program of the Xinjiang Uygur Autonomous Region (2022TSYCCX0038), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Y2022108), and the Postdoctoral Fellowship Program of Chinese Postdoctoral Science Foundation (CPSF) (GZC20232962).
Abstract: The Tarim River Basin (TRB) is a vast area with plenty of light and heat and is an important base for grain and cotton production in Northwest China. In the context of climate change, however, the increased frequency of extreme weather and climate events is having numerous negative impacts on the region's agricultural production. To better understand how unfavorable climatic conditions affect crop production, we explored the relationship of extreme weather and climate events with crop yields and phenology. In this research, ten indicators of extreme weather and climate events (consecutive dry days (CDD), min Tmax (TXn), max Tmin (TNx), tropical nights (TR), warm days (Tx90p), warm nights (Tn90p), summer days (SU), frost days (FD), very wet days (R95p), and windy days (WD)) were selected to analyze the impact of spatial and temporal variations on the yields of major crops (wheat, maize, and cotton) in the TRB from 1990 to 2020. The three key findings of this research were as follows: extreme temperatures in southwestern TRB showed an increasing trend, with higher extreme temperatures at night, while the occurrence of extreme weather and climate events in northeastern TRB was relatively low. The number of FD was on the rise, while WD also increased in recent years. Crop yields were higher in the northeast compared with the southwest, and wheat, maize, and cotton yields generally showed an increasing trend despite an earlier decline. The correlations between extreme weather and climate events and crop yields can be grouped by index type: extreme nighttime temperature indices (TNx, Tn90p, TR, and FD), extreme daytime temperature indices (TXn, Tx90p, and SU), extreme precipitation indices (CDD and R95p), and extreme wind (WD). Using a Random Forest (RF) approach to determine the effects of different extreme weather and climate events on the yields of different crops, we found that the importance of extreme precipitation indices (CDD and R95p) to crop yield decreased significantly over time. We also found that the importance of the extreme nighttime temperature indices (TR and TNx) for the yields of the three crops increased during 2005-2020 compared with 1990-2005. The impact of extreme temperature events on wheat, maize, and cotton yields in the TRB is becoming increasingly significant, and this finding can inform policy decisions and agronomic innovations to better cope with current and future climate warming.
Abstract: Understanding how environmental adaptation varies among families within a species is critical to adapt forestry activities such as management and breeding to possible future climate change. The present study examined home-site advantage and local advantage in growth and basic density of wood in 36 families of Chamaecyparis obtusa (Siebold et Zucc.) Endl., reciprocally planted at two progeny test sites with differing climatic conditions in Japan. A significant home-site advantage for growth was detected between the lowland and mountainous regions within the Kanto breeding region. In addition, the effects of climate differentials between the selection site of mating parents and the progeny test site on growth and basic density were investigated. As a result, temperature was identified as the most significant climatic factor contributing to local adaptation for growth traits. Elongation and radial growth were adversely influenced when the progeny test site temperature exceeded the provenance temperature by more than 2°C. Therefore, it is crucial to account for temperature differences between the provenance and the planting site to adapt afforestation and forest tree breeding to climate change in the future.
Funding: Funded by the First-Class Discipline Research Special Project of Inner Mongolia (YLXKZX-NSD-040), the Natural Science Foundation of Inner Mongolia (2022LHQN04003, 2023QN04009), the Fundamental Research Funds for the Inner Mongolia University of Finance and Economics (NCXKY25019, NCYWZ22003), and the National Social Science Fund of China (22BZS134).
Abstract: Fires are one of the most destructive natural disasters and have serious long-term effects on the environment, economy, and human health. In Inner Mongolia Autonomous Region, China, frequent fire disturbance occurs due to the intensification of climate change and human activities. It is crucial to understand the fire regime, estimate the probability of regional fire occurrence, and reduce fire losses. However, most studies have primarily focused on the dynamic changes, probability of occurrence, and driving mechanisms of wildfires in the grassland and forest land ecosystems in Inner Mongolia, while insufficient research has been conducted on the spatiotemporal variations in active fires and their impact on the wildfire risk in forest land and grassland. Therefore, in this study, we analyzed the active fire regime based on Moderate Resolution Imaging Spectroradiometer (MODIS) thermal anomalies and burned area products from 2000 to 2022. Combined with climate, topographic, landscape, anthropogenic, and vegetation datasets, logistic regression (LR), support vector machine (SVM), random forest (RF), and convolutional neural network (CNN) models were chosen to estimate the probability of active fire occurrence at the seasonal timescale. The results revealed that: (1) a total of 100,343 active fires occurred in Inner Mongolia and the burned area reached 6.59×10^4 km². The number of ignition points exhibited a significant increasing trend, while the burned area exhibited a nonsignificant decreasing trend; (2) four active fire belts were detected, namely, the Hetao-Tumochuan Plain fire belt, Xiliao River Plain fire belt, Songnen Plain fire belt, and Hailar River Eroded Plain fire belt. The centroid of the active fires has shifted 456.4 km toward the southwest; (3) the RF model achieved the highest accuracy in estimating the probability of active fire occurrence, followed by CNN, while the LR and SVM models had lower accuracies; and (4) the distribution of the high and extremely high fire risk areas largely aligned with the four fire belts. The probability of active fire occurrence was the highest in spring, followed by that in autumn, and it gradually decreased in summer and winter. Our results revealed that active fires migrated to the southwest and ignition sources increased, although the reduction in burned area was not significant. The RF model outperformed the other models in predicting the probability of active fire occurrence. These findings contribute to future fire prevention and prediction in Inner Mongolia.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42171407, 42077242) and the Key Program of the National Natural Science Foundation of China (No. 42330607).
Abstract: Recently, the outbreak and spread of larch caterpillar (Dendrolimus superans) pests have emerged as significant contributors to forest degradation in the Changbai Mountains, China. Understanding the spatiotemporal distribution patterns of these pests is crucial for effective management and protection of forest ecosystems. This study proposes a pest monitoring approach based on Sentinel imagery. Through time-series analysis, we extracted pest-sensitive features and developed a random forest classifier that integrated Sentinel-1, Sentinel-2, and field sampling data from 2019–2023 to monitor larch caterpillar pests in the Changbai Mountains National Nature Reserve (CMNNR), Northeast China. Our findings indicated that the green (B3), near-infrared (B8), and shortwave infrared (B11 and B12) bands from Sentinel-2 remote sensing images exhibited notable discriminative capabilities for identifying larch caterpillar pests. Specifically, the Normalized Difference Vegetation Index (NDVI) at the end of the growing season emerged as the most valuable feature for pest extraction. Incorporating Synthetic Aperture Radar (SAR) features along with optical data marginally enhanced model performance. Furthermore, our approach revealed the outbreak of larch caterpillar pests, producing classification maps with overall accuracy exceeding 85% and Kappa coefficients surpassing 0.8 for the five study years. The pest outbreak began in 2019 and progressively intensified over time. In September 2019, the affected area spanned 114.23 km². The infested area exhibited a declining trend from 2020 to 2023. This study introduces a novel method for the high-precision identification of larch caterpillar pests, offering technical advancements and theoretical underpinnings to support forest management strategies.
Funding: This work was supported by the Taif University Researchers Supporting Project Number (TURSP-2020/254).
Abstract: COVID-19, being the virus of fear and anxiety, is one of the most recent and emergent of various respiratory disorders. It is similar to MERS-CoV and SARS-CoV, the viruses that affected a large population of different countries in the years 2012 and 2002, respectively. Various standard models have been used for COVID-19 epidemic prediction, but they suffered from low accuracy due to limited data availability and a high level of uncertainty. The proposed approach used a machine learning-based time-series Facebook NeuralProphet model to predict the number of deaths as well as confirmed cases and compared it with a Poisson distribution model and a Random Forest model. The analysis of the dataset covered the period from January 1, 2020 to July 16, 2021. The model was developed to obtain forecast values up to September 2021. This study aimed to predict the course of the COVID-19 pandemic during the second wave of coronavirus in India, using the latest time-series model to observe and predict the coronavirus pandemic situation across the country. In India, cases have been increasing rapidly day by day since mid-February 2021. The proposed model showed good ability to forecast the death rate in the COVID-19 dataset, particularly in the second wave, and works effectively for future validation of the predictions.
Abstract: BACKGROUND: Gestational diabetes mellitus (GDM) is a condition characterized by high blood sugar levels during pregnancy. The prevalence of GDM is on the rise globally, and this trend is particularly evident in China, where it has emerged as a significant issue impacting the well-being of expectant mothers and their fetuses. Identifying and addressing GDM in a timely manner is crucial for maintaining the health of both expectant mothers and their developing fetuses. Therefore, this study aims to establish a risk prediction model for GDM and explore the effects of serum ferritin, blood glucose, and body mass index (BMI) on the occurrence of GDM. AIM: To develop a risk prediction model to analyze factors leading to GDM, and evaluate its efficiency for early prevention. METHODS: The clinical data of 406 pregnant women who underwent routine prenatal examination in Fujian Maternity and Child Health Hospital from April 2020 to December 2022 were retrospectively analyzed. According to whether GDM occurred, they were divided into two groups to analyze the related factors affecting GDM. Then, according to the weight of the relevant risk factors, the training set and the verification set were divided at a ratio of 7:3. Subsequently, a risk prediction model was established using logistic regression and random forest models, and the model was evaluated and verified. RESULTS: Pre-pregnancy BMI, previous history of GDM or macrosomia, hypertension, hemoglobin (Hb) level, triglyceride level, family history of diabetes, serum ferritin, and fasting blood glucose levels during early pregnancy were determined. These factors were found to have a significant impact on the development of GDM (P<0.05). According to the nomogram model's prediction of GDM in pregnancy, the area under the curve (AUC) was determined to be 0.883 [95% confidence interval (CI): 0.846-0.921], and the sensitivity and specificity were 74.1% and 87.6%, respectively. The top five variables in the random forest model for predicting the occurrence of GDM were serum ferritin, fasting blood glucose in early pregnancy, pre-pregnancy BMI, Hb level and triglyceride level. The random forest model achieved an AUC of 0.950 (95% CI: 0.927-0.973), the sensitivity was 84.8%, and the specificity was 91.4%. The DeLong test showed that the AUC value of the random forest model was higher than that of the decision tree model (P<0.05). CONCLUSION: The random forest model is superior to the nomogram model in predicting the risk of GDM. This method is helpful for early diagnosis and appropriate intervention of GDM.
Funding: The First People's Hospital of Wenling (approval No. KY-2023-2035-01).
Abstract: BACKGROUND: Type 2 diabetes mellitus (T2DM) is associated with periodontitis. Currently, there are few studies proposing predictive models for periodontitis in patients with T2DM. AIM: To determine the factors influencing periodontitis in patients with T2DM by constructing logistic regression and random forest models. METHODS: In this retrospective study, 300 patients with T2DM who were hospitalized at the First People's Hospital of Wenling from January 2022 to June 2022 were selected for inclusion, and their data were collected from hospital records. We used logistic regression to analyze factors associated with periodontitis in patients with T2DM, and random forest and logistic regression prediction models were established. The prediction efficiency of the models was compared using the area under the receiver operating characteristic curve (AUC). RESULTS: Of 300 patients with T2DM, 224 had periodontitis, with an incidence of 74.67%. Logistic regression analysis showed that age [odds ratio (OR) = 1.047, 95% confidence interval (CI): 1.017-1.078], teeth brushing frequency (OR = 4.303, 95% CI: 2.154-8.599), education level (OR = 0.528, 95% CI: 0.348-0.800), glycosylated hemoglobin (HbA1c) (OR = 2.545, 95% CI: 1.770-3.661), total cholesterol (TC) (OR = 2.872, 95% CI: 1.725-4.781), and triglyceride (TG) (OR = 3.306, 95% CI: 1.019-10.723) influenced the occurrence of periodontitis (P < 0.05). The random forest model showed that the most influential variable was HbA1c, followed by age, TC, TG, education level, brushing frequency, and sex. Comparison of the prediction effects of the two models showed that in the training dataset, the AUC of the random forest model was higher than that of the logistic regression model (AUC = 1.000 vs AUC = 0.851; P < 0.05).
In the validation dataset, there was no significant difference in AUC between the random forest and logistic regression models (AUC = 0.946 vs AUC = 0.915; P > 0.05). CONCLUSION: Both random forest and logistic regression models have good predictive value and can accurately predict the risk of periodontitis in patients with T2DM.
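The AUC values compared in this abstract have a simple interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The function name and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- 1 for positive cases, 0 for negative cases
    scores -- predicted risk scores, higher meaning more likely positive
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores (not from the study): one positive is ranked below
# one negative, so 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(auc)
```

An AUC of exactly 1.000 on training data, as reported for the random forest, means every positive case outranks every negative one in that set. Random forests often fit their own training samples this closely, so the validation AUC is the more informative figure for comparing the two models.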
Abstract: Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of these methods use only the time-domain information of traffic flow data, ignoring the impact of spatial correlation on the flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model called long short-term memory and random forest (LSTM-RF) is proposed. In the process of traffic flow prediction, the long short-term memory (LSTM) model was used to extract the time sequence features of the target road segment. Then, the predicted value of the LSTM and the collected information of adjacent upstream and downstream sections were used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction results. The model was tested and verified on traffic flow data of 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method is better than the single model in prediction accuracy, and the prediction error is clearly reduced compared with the single model.
Funding: Funded by the National Key Research and Development Program of China, Strategic International Cooperation in Science and Technology Innovation Program (2018YFE0207800), and the National Natural Science Foundation of China (31971483).
Abstract: The dead fuel moisture content (DFMC) is the key driver leading to fire occurrence. Accurately estimating the DFMC could help identify locations facing fire risks, prioritise areas for fire monitoring, and facilitate timely deployment of fire-suppression resources. In this study, the DFMC and environmental variables, including air temperature, relative humidity, wind speed, solar radiation, rainfall, atmospheric pressure, soil temperature, and soil humidity, were simultaneously measured in a grassland of Ergun City, Inner Mongolia Autonomous Region of China in 2021. We chose three regression models, i.e., the random forest (RF) model, extreme gradient boosting (XGB) model, and boosted regression tree (BRT) model, to model the seasonal DFMC according to the data collected. To ensure accuracy, we added time-lag variables of 3 d to the models. The results showed that, among the three models, the RF model had the best fitting effect, with an R² value of 0.847, and the best prediction accuracy, with a mean absolute error of 4.764%. The accuracies of the models in spring and autumn were higher than those in the other two seasons. In addition, different seasons had different key influencing factors, and the degree of influence of these factors on the DFMC changed with time lags. Moreover, time-lag variables within 44 h clearly improved the fitting effect and prediction accuracy, indicating that environmental conditions within approximately 48 h greatly influence the DFMC. This study highlights the importance of considering 48 h time-lagged variables when predicting the DFMC of grassland fuels and mapping grassland fire risks based on the DFMC to help locate high-priority areas for grassland fire monitoring and prevention.
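The time-lag variables described in this abstract amount to pairing each DFMC measurement with environmental readings from earlier time steps. A minimal sketch of that feature construction follows. The function name, the single-variable input, and the toy readings are assumptions for illustration; the abstract does not give the study's sampling interval or exact lag set.

```python
def build_lagged_rows(env, target, lags):
    """Pair each target value with current and lagged environmental readings.

    env    -- environmental readings at regular time steps (e.g. hourly)
    target -- DFMC measured at the same time steps
    lags   -- positive step offsets, e.g. [1, 2] for one and two steps back
    Returns a list of (features, target_value) tuples, skipping the first
    max(lags) steps, where not enough history exists.
    """
    if len(env) != len(target):
        raise ValueError("env and target must have the same length")
    if any(lag <= 0 for lag in lags):
        raise ValueError("lags must be positive integers")
    start = max(lags, default=0)
    rows = []
    for t in range(start, len(env)):
        # Current reading first, then one feature per requested lag.
        features = [env[t]] + [env[t - lag] for lag in lags]
        rows.append((features, target[t]))
    return rows


# Hypothetical hourly relative humidity (%) and DFMC (%) readings.
humidity = [40.0, 42.0, 45.0, 50.0, 48.0]
dfmc = [8.0, 8.5, 9.0, 10.0, 9.8]
rows = build_lagged_rows(humidity, dfmc, lags=[1, 2])
for features, value in rows:
    print(features, value)
```

Rows built this way could then be passed to any regression model. With several environmental variables, the same lagging would be applied to each one and the resulting features concatenated.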
Abstract: In a recent paper, Hong et al developed an artificial intelligence (AI)-driven predictive scoring system for potential complications following laparoscopic radical gastrectomy for gastric cancer patients. They demonstrated that integrating AI with random forest models significantly improved the accuracy of preoperative prediction and patient outcome management. By incorporating data from multiple centers, their model ensures standardization, reliability, and broad applicability, distinguishing it from prior models. The present study highlights AI's potential in clinical decision support, aiding in the preoperative and postoperative management of gastric cancer patients. Our findings may pave the way for future prospective studies to further enhance AI-supported diagnoses in clinical practice.
Funding: Funded by the National Natural Science Foundation of China (42371022, 42030501, 41877148).
Abstract: The critical zone (CZ) plays a vital role in sustaining biodiversity and humanity. However, flux quantification within the CZ, particularly in terms of subsurface hydrological partitioning, remains a significant challenge. This study focused on quantifying subsurface hydrological partitioning, specifically in an alpine mountainous area, and highlighted the important role of lateral flow during this process. Precipitation entering the soil was usually partitioned into two parts: an increase in soil water content (SWC) and lateral flow out of the soil pit. It was found that 65%–88% of precipitation contributed to lateral flow. The second most common partitioning class showed an increase in SWC caused by both precipitation and lateral flow into the soil pit. In this case, lateral flow contributed 43% to 74% of the SWC increase, which was notably larger than the SWC increase caused by precipitation. On alpine meadows, lateral flow from the soil pit occurred when the shallow soil was wetter than the field capacity. This result highlighted the need for three-dimensional simulation between soil layers in Earth system models (ESMs). During the evapotranspiration process, significant differences were observed in the classification of subsurface hydrological partitioning among different vegetation types. Due to tangled and aggregated fine roots in the surface soil on alpine meadows, the majority of subsurface responses involved lateral flow, which provided 98%–100% of evapotranspiration (ET). On grassland, there was a high probability (0.87) that ET was entirely provided by lateral flow. The main reason for underestimating transpiration through soil water dynamics in previous research was the neglect of lateral root water uptake. Furthermore, there was a probability of 0.12 that ET was entirely provided by the SWC decrease on grassland. In this case, there was a high probability (0.98) that soil water responses only occurred at layer 2 (10–20 cm), because grass roots were mainly distributed in this soil layer, and grasses often used their deep roots for water uptake during ET. To improve the estimation of soil water dynamics and ET, we established a random forest (RF) model to simulate lateral flow and then corrected the community land model (CLM). The RF model demonstrated good performance and led to significant improvements in the CLM simulation. These findings enhance our understanding of subsurface hydrological partitioning and emphasize the importance of considering lateral flow in ESMs and hydrological research.
Abstract: Survival rates following radical surgery for gastric neuroendocrine neoplasms (g-NENs) are low, with high recurrence rates. This fact impacts patient prognosis and complicates postoperative management. Traditional prognostic models, including the Cox proportional hazards (CoxPH) model, have shown limited predictive power for postoperative survival in gastrointestinal neuroectodermal tumor patients. Machine learning methods offer a unique opportunity to analyze complex relationships within datasets, providing tools and methodologies to assess large volumes of high-dimensional, multimodal data generated by biological sciences. These methods show promise in predicting outcomes across various medical disciplines. In the context of g-NENs, utilizing machine learning to predict survival outcomes holds potential for personalized postoperative management strategies. This editorial reviews a study exploring the advantages and effectiveness of the random survival forest (RSF) model, using the lymph node ratio (LNR), in predicting disease-specific survival (DSS) in postoperative g-NEN patients stratified into low-risk and high-risk groups. The findings demonstrate that the RSF model, incorporating LNR, outperformed the CoxPH model in predicting DSS and constitutes an important step towards precision medicine.
Abstract: In this study, multi-source remote sensing data and machine learning algorithms were used to delineate the prospect area for remote sensing geological prospecting in eastern Botswana. Landsat 8 remote sensing images were used to produce iron stain and hydroxyl anomaly maps, ASTER remote sensing images were used to extract chalcopyrite mineral distribution maps, and Microsoft high-resolution remote sensing data were used to extract lithology and structure maps to comprehensively analyze regional metallogenic information. Then, the random forest, classification and regression tree (CART), and gradient boosting tree (GBT) classification algorithms were compared. The results showed that the random forest algorithm had the best performance in identifying mineralization potential areas, and its accuracy reached 0.95. Finally, the remote sensing geological prospect area of eastern Botswana was delineated based on the random forest algorithm, which provided important technical support for mineral resource exploration in this area. This study shows that the combination of multi-source remote sensing data and efficient classification algorithms has great potential in geological prospecting, and provides scientific methods and technical means for follow-up remote sensing prospecting research.
Funding: Supported by the Second Tibetan Plateau Scientific Expedition and Research Program (STEP) (Grant No. 2019QZKK0307).
Abstract: Climate change influences both ecosystems and ecosystem services. The impacts of climate change on ecosystems and ecosystem services have been separately documented. However, it is less well known how ecosystem changes driven by climate change will influence ecosystem services, especially in climate-sensitive regions. Here, we analyzed future climate trends between 2040 and 2100 under four Shared Socioeconomic Pathway (SSP) scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) from the Coupled Model Intercomparison Project 6 (CMIP6). We quantified their impacts on ecosystem patterns and on the ecosystem service of sandstorm prevention on the Qinghai-Tibet Plateau (QTP), one of the most climate-sensitive regions in the world, using the Random Forest (RF) model and the Revised Wind Erosion Equation (RWEQ). Strong warming (0.04℃/yr) and wetting (0.65 mm/yr) trends were projected from 2015 to 2100. Under these trends, there will be increased interspersion in the pattern of grassland and sparse vegetation with meadow and swamp vegetation, although their overall area will remain similar, while the areas of shrub and needle-leaved forest classes will increase and move toward higher altitudes. Driven indirectly by climate-induced changes in ecosystem patterns, grassland will play an irreplaceable role in providing sandstorm prevention services, and sandstorm prevention services will increase gradually from 2040 to 2100 (1.059-1.070 billion tons) on the QTP. However, some areas show a risk of deterioration in the future, and these should be the focus of ecological rehabilitation. Our research helps to understand the cascading relationship among climate change, ecosystem patterns, and ecosystem services, which provides important spatio-temporal information for future ecosystem service management.
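A per-year trend such as the warming rate quoted above is commonly estimated as the slope of a least-squares line through an annual series. The sketch below assumes that approach (the abstract does not state how the trends were derived); the function name and the temperature values are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    x_bar = sum(years) / n
    y_bar = sum(values) / n
    # Covariance-style and variance-style sums; their ratio is the slope.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    sxx = sum((x - x_bar) ** 2 for x in years)
    return sxy / sxx


# Hypothetical annual mean temperatures (degrees C), for illustration only.
years = [2015, 2016, 2017, 2018]
temps = [1.00, 1.04, 1.08, 1.12]

slope = linear_trend(years, temps)  # warming rate in degrees C per year
```

The same function applies to an annual precipitation series, in which case the slope is expressed in mm/yr.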