The proliferation of robot accounts on social media platforms has had a significant negative impact, necessitating robust measures to counter network anomalies and safeguard content integrity. Social robot detection has emerged as a pivotal yet intricate task aimed at limiting the spread of misleading information. Graph-based approaches perform well on this task but share a fundamental limitation: the homogeneity assumption in graph convolution lets social robots evade detection by mingling with genuine human profiles. To counter this camouflage, this work proposes a social robot detection framework based on enhanced HOmogeneity and Random Forest (HORFBot). At its core is a homogeneous graph enhancement strategy that uses edge removal to split the graph into multiple subgraphs. Multiple graph convolutional networks are then trained with contrastive learning, each on one of these subgraphs. Finally, these base classifiers are fused, and their outputs are aggregated to produce the detection result. Extensive experiments on three social robot detection datasets show that the method improves detection accuracy and outperforms the comparison methods.
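The edge-removal idea can be illustrated with a minimal sketch. The abstract does not say how HORFBot decides which edges to remove, so everything here is an assumption made for illustration: the cosine-similarity criterion, the thresholds, and the hypothetical helper names `prune_edges` and `build_subgraphs`.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def prune_edges(edges, features, threshold):
    """Keep only edges whose endpoints look alike (assumed homogeneity criterion)."""
    return [(i, j) for i, j in edges
            if cosine(features[i], features[j]) >= threshold]

def build_subgraphs(edges, features, thresholds):
    """Build one pruned edge list per threshold; each could feed its own base model."""
    return {t: prune_edges(edges, features, t) for t in thresholds}

# Toy graph: nodes 0 and 1 are similar, nodes 2 and 3 are similar.
features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.9]}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
subgraphs = build_subgraphs(edges, features, thresholds=(0.05, 0.5))
```

With the stricter threshold, only the edges between similar nodes survive, which is the sense in which pruning increases homogeneity within each subgraph.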
Evaluating the water richness of sandstone is an important research topic in the prevention and control of mine water disasters, and water richness is closely related to sandstone porosity. Reflection seismic exploration data provide high-density spatial sampling, which gives an important data basis for predicting sandstone porosity in coal seam roofs. First, the basic principles of the variational mode decomposition (VMD) method and the random forest method are introduced. Then, a geological model of coal seam roof sandstone is constructed, seismic forward modeling is conducted, and random noise is added. The decomposition effects of the empirical mode decomposition (EMD) method and the VMD method on noisy signals are compared and analyzed. The test results show that the first-order intrinsic mode function (IMF1) and IMF2 decomposed by the VMD method contain the main effective components of the seismic signals. A workflow for predicting sandstone porosity in coal seam roofs based on the combination of VMD and random forest is proposed. The feasibility and effectiveness of the method are verified by trial calculations on model data. Taking actual coalfield reflection seismic data as an example, the sandstone porosity of the 8 coal seam roof is predicted. The application results show the potential value of the proposed porosity prediction method, which offers theoretical guidance for evaluating water richness in coal seam roof sandstone and for the prevention and control of mine water disasters.
To improve the efficiency of air quality analysis and the accuracy of predictions, this paper proposes a composite method based on Vector Autoregressive (VAR) and Random Forest (RF) models. The theoretical section introduces the models and their estimation algorithms. The empirical section applies the proposed method to global air quality data from 2022 to 2024. Specifically, principal component analysis (PCA) is conducted first, and the VAR and Random Forest methods are then used for prediction on the dimension-reduced data. The results show that the RMSE of the hybrid model is 45.27, lower than the 49.11 of the VAR model alone, which supports its superiority. The stability and predictive performance of the model are both improved.
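To put the two RMSE figures in proportion, the following sketch computes RMSE and the relative reduction between a baseline and an improved model. Only the values 49.11 and 45.27 come from the abstract; the function names are illustrative.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

# RMSE values reported in the abstract: VAR alone vs. the hybrid model.
reduction = relative_reduction(49.11, 45.27)
print(f"relative RMSE reduction: {reduction:.1%}")
```

On the reported values this is a reduction of roughly 7.8% relative to the VAR baseline.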
Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in smart environments. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), often incur high computational costs, which makes real-time detection challenging. We therefore propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), the Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, VGG16 extracts features from malware images, and the random forest model makes predictions from the extracted features. ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative study shows that the strategy improves performance while maintaining a lower computational cost, making the method both effective and well suited to real-time applications.
This paper explores the synergistic effect of a model combining Elastic Net and Random Forest in online fraud detection. The study selects a public network dataset containing 1781 records, divides it into 70% for training and 30% for validation, and analyses the correlation between features using a correlation matrix. The experimental results show that Elastic Net feature selection generally outperforms PCA across all models, especially when combined with the Random Forest and XGBoost models. The ElasticNet + Random Forest model achieves the highest accuracy of 0.968 and an AUC of 0.983, while its Kappa and MCC reach 0.839 and 0.844, respectively, indicating strong agreement and correlation between predictions and labels. Combining Elastic Net feature selection with a Random Forest model therefore offers significant performance advantages in online fraud detection.
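For readers unfamiliar with the Kappa and MCC figures, the sketch below computes accuracy, Cohen's kappa, and the Matthews correlation coefficient from a binary confusion matrix using their standard definitions. The counts are made up for illustration and are not the study's data.

```python
import math

def binary_metrics(tp, fn, fp, tn):
    """Accuracy, Cohen's kappa, and MCC from binary confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "kappa": kappa, "mcc": mcc}

# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

Both kappa and MCC correct for chance agreement, which is why they are typically lower than raw accuracy on imbalanced fraud data.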
The Darjeeling Himalayan region, characterized by complex topography and vulnerability to multiple environmental hazards, faces significant challenges including landslides, earthquakes, flash floods, and soil loss that critically threaten ecosystem stability. Among these challenges, soil erosion emerges as a silent disaster: a gradual yet relentless process whose impacts accumulate over time, progressively degrading landscape integrity and disrupting ecological sustainability. Unlike catastrophic events with immediate visibility, soil erosion's most devastating consequences often manifest decades later through diminished agricultural productivity, habitat fragmentation, and irreversible biodiversity loss. This study developed a scalable predictive framework employing Random Forest (RF) and Gradient Boosting Tree (GBT) machine learning models to assess and map soil erosion susceptibility across the region. A comprehensive geo-database was developed incorporating 11 erosion triggering factors: slope, elevation, rainfall, drainage density, topographic wetness index, normalized difference vegetation index (NDVI), curvature, soil texture, land use, geology, and aspect. A total of 2,483 historical soil erosion locations were identified and randomly divided into two sets: 70% for model building and 30% for validation. The models revealed distinct spatial patterns of erosion risk, with GBT classifying 60.50% of the area as very low susceptibility, while RF identified 28.92% in this category. Notable differences emerged in high-risk zone identification, with GBT highlighting 7.42% and RF indicating 2.21% as very high erosion susceptibility areas. Both models demonstrated robust predictive capabilities, with GBT achieving 80.77% accuracy and 0.975 AUC, slightly outperforming RF's 79.67% accuracy and 0.972 AUC. Analysis of predictor variables identified elevation, slope, rainfall, and NDVI as the primary factors influencing erosion susceptibility, highlighting the complex interrelationship between geo-environmental factors and erosion processes. This research offers a strategic framework for targeted conservation and sustainable land management in the fragile Himalayan region, providing insights to help policymakers implement effective soil erosion mitigation strategies and support long-term environmental sustainability.
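The random 70/30 division of the 2,483 erosion locations can be sketched as follows. The seed, the rounding rule, and the function name are assumptions for illustration; the abstract does not state how the split was drawn.

```python
import random

def train_validation_split(items, train_fraction=0.7, seed=42):
    """Shuffle a copy of `items` and cut it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)  # rounds down (an assumption)
    return shuffled[:cut], shuffled[cut:]

# Stand-in IDs for the 2,483 historical erosion locations.
location_ids = range(2483)
train_ids, validation_ids = train_validation_split(location_ids)
print(len(train_ids), len(validation_ids))
```

Under this rounding rule the split gives 1,738 locations for model building and 745 for validation.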
Zenith wet delay (ZWD) is a key parameter for precise positioning with global navigation satellite systems (GNSS) and plays a central role in meteorological research. Currently, most models consider only the periodic variability of the ZWD and neglect the effect of nonlinear factors on ZWD estimation, which limits their ability to reflect rapid fluctuations of the ZWD. To capture and predict complicated variations in ZWD more accurately, this paper develops the CRZWD model by combining the GPT3 model with the random forest (RF) algorithm, using 5-year atmospheric profiles from 70 radiosonde (RS) stations across China. Taking data from 25 external test stations as the reference, the root mean square (RMS) of the CRZWD model is 29.95 mm. Compared with the GPT3 model and another model using a backpropagation neural network (BPNN), the accuracy improves by 24.7% and 15.9%, respectively. Notably, over 56% of the test stations show an improvement of more than 20% relative to GPT3-ZWD. Further temporal and spatial analyses also demonstrate the accuracy and stability advantages of the CRZWD model, indicating its potential for GNSS-based applications.
With the spread of microgrid construction and the connection of renewable energy sources to the power system, the source and load uncertainty faced by the coordinated operation of multi-microgrids is becoming increasingly prominent, and the accuracy of typical scenario prediction is low. To improve the accuracy of scenario prediction under source and load uncertainty, this paper proposes a typical scenario identification model based on random forests and order parameters. Firstly, a method for identifying and quantifying order parameters is provided for the coordinated operating mode of multi-microgrids, taking source-load uncertainty into account. Secondly, the dynamic characteristics of the order parameters of the daily load curve, the wind and solar curve, and the load curve of typical scenarios are statistically analyzed to identify the key order parameters that have the most significant impact on load uncertainty. Then, the order parameters and seasonal distribution are used as features to train a random forest classification model for efficient scenario prediction. Finally, a simulation on actual data from a provincial distribution network shows that the proposed method classifies typical scenarios with an accuracy of 92.7%. Additionally, a sensitivity analysis assesses how changes in uncertainty levels affect the importance of each order parameter, allowing for adaptive uncertainty mitigation strategies.
One of the core tasks in analyzing Electrochemical Impedance Spectroscopy (EIS) data is selecting an appropriate equivalent circuit model to quantify the parameters of the electrochemical reaction process. However, this selection often relies on human experience and judgment, which introduces subjectivity and error. This paper proposes an intelligent approach, based on the Random Forest algorithm, for matching EIS data to their equivalent circuits. It automatically selects the most suitable equivalent circuit model based on the characteristics and patterns of the EIS data. Addressing the typical scenario of metal corrosion, the paper constructs an atmospheric corrosion EIS dataset of low-carbon steel covering five different corrosion scenarios, and uses it to validate and evaluate the proposed method. The contributions can be summarized in three aspects: (1) a method for selecting equivalent circuit models for EIS data based on the Random Forest algorithm; (2) a dataset of authentic EIS data collected from metal atmospheric corrosion, encompassing five categories of corrosion scenarios; and (3) validation of the proposed method on this dataset. The experimental results demonstrate that, for equivalent circuit matching, the method surpasses other machine learning algorithms in both precision and robustness, and that it is well suited to the analysis of EIS data.
A switch from avian-type α-2,3 to human-type α-2,6 receptors is an essential element for the initiation of a pandemic from an avian influenza virus. Some H9N2 viruses exhibit a preference for binding to human-type α-2,6 receptors, which marks them as a potential threat to public health. However, our understanding of the molecular basis for this switch of receptor preference is still limited. In this study, we employed the random forest algorithm to identify potentially key amino acid sites within hemagglutinin (HA) that are associated with the receptor binding ability of H9N2 avian influenza virus (AIV). These sites were then further verified by receptor binding assays. A total of 12 substitutions in the HA protein (N158D, N158S, A160N, A160D, A160T, T163I, T163V, V190T, V190A, D193N, D193G, and N231D) were predicted to prefer binding to α-2,6 receptors. Except for V190T, all of these substitutions were shown by receptor binding assays to bind preferentially to α-2,6 receptors. In particular, the A160T substitution caused a significant upregulation of immune-response genes and an increased mortality rate in mice. Our findings provide novel insights into the genetic basis of receptor preference of the H9N2 AIV.
The agricultural Internet of Things (IoT) system is a critical component of modern smart agriculture, and its security risk assessment methods have attracted increasing attention from the industry. Current agricultural IoT security risk assessment methods rely primarily on expert judgment, introducing subjective factors that reduce the credibility of the assessment results. To address this issue, this study constructed a dataset for agricultural IoT security risk assessment based on real-world security reports. A PCARF algorithm, built on random forest principles, was proposed, incorporating ensemble learning strategies to enhance prediction accuracy. Compared to the second-best model, the proposed model demonstrated a 2.7% increase in accuracy, a 3.4% improvement in recall, a 3.1% rise in Area Under the Curve (AUC), and a 7.9% boost in Matthews Correlation Coefficient (MCC). Extensive comparative experiments showed that the proposed model outperforms the others in prediction accuracy and robustness.
Accurate Electric Load Forecasting (ELF) is crucial for optimizing production capacity, improving operational efficiency, and managing energy resources effectively. Precise ELF also contributes to a smaller environmental footprint by reducing the risks of disruption, downtime, and waste. However, with increasingly complex energy consumption patterns driven by renewable energy integration and changing consumer behaviors, no single approach has emerged as universally effective. In response, this research presents a hybrid modeling framework that combines the strengths of Random Forest (RF) and Autoregressive Integrated Moving Average (ARIMA) models, enhanced with an advanced feature selection method, Minimum Redundancy Maximum Relevancy and Maximum Synergy (MRMRMS), to produce a sparse model. Additionally, the residual patterns are analyzed to enhance forecast accuracy. High-resolution weather data from Weather Underground and historical energy consumption data from PJM for Duke Energy Ohio and Kentucky (DEO&K) are used in this application. The methodology, termed SP-RF-ARIMA, is evaluated against existing approaches; it demonstrates a reduction of more than 40% in mean absolute error and root mean square error compared to the second-best method.
The prediction of slope stability is a complex nonlinear problem. This paper proposes a new method based on the random forest (RF) algorithm to study the stability of rock slopes. Taking Bukit Merah, Perak and Twin Peak (Kuala Lumpur) as the study areas, the slope characteristics and geometrical parameters are obtained from a multidisciplinary approach consisting of geological, geotechnical, and remote sensing analyses. 18 factors, including rock strength, rock quality designation (RQD), joint spacing, continuity, openness, roughness, filling, weathering, water seepage, temperature, vegetation index, water index, and orientation, are selected as model input variables, while the factor of safety (FOS) serves as the output. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve, together with precision and accuracy, is used to assess the predictive ability of the model. With a large training set and predicted parameters, an AUC of 0.95 is achieved. A precision score of 0.88 is obtained, indicating that the model has a low false positive rate and correctly identifies a substantial number of true positives. The findings emphasise the importance of using a variety of terrain characteristics and different approaches to characterise rock slopes.
Real-time intelligent lithology identification while drilling is vital to realizing downhole closed-loop drilling. The complex and changeable geological environment encountered during drilling poses many challenges for lithology identification. This paper studies three problems in intelligent lithology identification: difficult feature extraction, low precision of thin-layer identification, and limited model applicability. The author seeks to improve the overall performance of the lithology identification model in three respects: data feature extraction, class balance, and model design. A new real-time intelligent lithology identification model, the dynamic felling strategy weighted random forest algorithm (DFW-RF), is proposed. According to the feature selection results, gamma ray and 2 MHz phase resistivity are the logging while drilling (LWD) parameters that most significantly influence lithology identification. The overall performance of the DFW-RF model has been verified on 3 wells in different areas. Compared with the prediction results of five typical lithology identification algorithms, the DFW-RF model achieves a higher identification accuracy and F1 score. The model improves the identification accuracy of thin-layer lithology and is effective and feasible in different geological environments. It supports efficient real-time intelligent identification of lithologic information in closed-loop drilling and has broad applicability, which makes it a candidate for wider use in logging interpretation.
During construction, the shield linings of tunnels often move upward, locally or as a whole, after leaving the shield tail in soft soil areas or in some large-diameter shield projects. Differential floating increases the initial stress on the segments and bolts, which harms the service performance of the tunnel. In this study, we used a random forest (RF) algorithm combined with particle swarm optimization (PSO) and 5-fold cross-validation (5-fold CV) to predict the maximum upward displacement of tunnel linings induced by shield tunnel excavation. The mechanism and factors causing upward movement of the tunnel lining are comprehensively summarized. Twelve input variables were selected according to the analysis of influencing factors. The prediction performance of two models, PSO-RF and RF (default), was compared. The Gini value was used to represent the relative importance of the influencing factors to the upward displacement of the linings. The PSO-RF model predicted the maximum upward displacement of the tunnel linings with a low error (mean absolute error (MAE) = 4.04 mm, root mean square error (RMSE) = 5.67 mm) and high correlation (R^(2) = 0.915). The thrust and the depth of the tunnel were the most important factors in the prediction model influencing the upward displacement of the tunnel linings.
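The 5-fold cross-validation step can be sketched with the standard library alone. This shows only how the index splits are formed; the shuffling, the seed, and the function name `k_fold_indices` are illustrative assumptions, and the random forest and PSO parts are left out.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

# Example with 23 samples: every sample is held out exactly once.
splits = list(k_fold_indices(23, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

In a tuning loop, each candidate hyper-parameter setting would be scored by averaging its error over the five held-out folds.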
OBJECTIVE: To investigate blood pressure rhythm (BPR) in Yin deficiency syndrome of hypertension (YDSH) patients and develop a random forest model for predicting YDSH. METHODS: Our study was consistent with the technical processes and specification for developing guidelines of Evidence-based Chinese medicine clinical practice (T/CACM 1032-2017). We enrolled 234 patients who had been diagnosed with primary hypertension and had taken no antihypertensive medications prior to enrollment. All participants were divided into a Yin deficiency group (YX, n = 74) and a non-Yin deficiency group (NYX, n = 160). Participants were grouped by three experienced chief Traditional Chinese Medicine (TCM) physicians according to the four examinations (i.e., inspection, listening and smelling, inquiry, and palpation). We collected data on 24 h ambulatory blood pressure monitoring (ABPM) and the YDSH rating scale. We divided the 24 h of a day into 12 two-hour periods [Chen-Shi (7:00-9:00), Si-Shi (9:00-11:00), Wu-Shi (11:00-13:00), Wei-Shi (13:00-15:00), Shen-Shi (15:00-17:00), You-Shi (17:00-19:00), Xu-Shi (19:00-21:00), Hai-Shi (21:00-23:00), Zi-Shi (23:00-1:00), Chou-Shi (1:00-3:00), Yin-Shi (3:00-5:00), Mao-Shi (5:00-7:00)] according to the theory of "midnight-midday ebb flow". We used random forest to build the diagnostic model of YDSH, with the presence of Yin deficiency syndrome as the outcome. RESULTS: Compared with the NYX group, the YX group had more female participants, older age, and lower waist circumference, body mass index (BMI), diastolic blood pressure (DBP), and smoking and drinking rates (all P < 0.05). The YDSH rating scores of the YX group [28.5 (21.0-36.0)] were significantly higher than those of the NYX group [13.0 (8.0-22.0)] (P < 0.001), and the typical symptoms of the YX group included vexing heat in the chest, palms and soles, dizziness, dry eyes, string-like and fine pulse, soreness and weakness of lumbus and knees, palpitations, reddened cheeks, and tinnitus (all P < 0.05). The ratio of non-dipper hypertension in the YX group was higher than in the NYX group (56.9% vs 44.4%, P = 0.004). Compared with the NYX group, 24 h DBP standard deviation (SD), nighttime DBP SD, Si-Shi DBP, Si-Shi mean arterial pressure (MAP), Hai-Shi systolic blood pressure (SBP), Hai-Shi DBP, Hai-Shi MAP, Zi-Shi SBP, Zi-Shi DBP, Zi-Shi MAP, Chou-Shi SBP SD, Chou-Shi DBP SD, and Chou-Shi SBP coefficient of variation (CV) were lower in the YX group (all P < 0.05). Binary logistic regression analysis showed that the diagnosis of YDSH was positively correlated with age, heart rate, YDSH rating scores, and four TCM symptoms including vexing heat in the chest, palms and soles, string-like and fine pulse, soreness and weakness of lumbus and knees, and reddened cheeks (all P < 0.05), but was negatively correlated with smoking (P > 0.05). In addition, the diagnosis of YDSH was positively correlated with daytime SBP SD, nighttime SBP SD, nighttime SBP CV, and Hai-Shi SBP CV, but was negatively correlated with 24 h SBP CV, daytime DBP SD, nighttime DBP SD, and Hai-Shi DBP (all P < 0.05). Hai-Shi SBP CV had an independent and positive correlation with the diagnosis of YDSH after adjusting for age, gender, course of hypertension, BMI, waist circumference, SBP, DBP, heart rate, smoking, and drinking (P = 0.029). A diagnostic model of YDSH was established and verified based on the random forest. The results showed that the accuracy, specificity, and sensitivity were 77.3%, 77.8%, and 76.9%, respectively. CONCLUSION: The BPR was significantly attenuated in YDSH patients, including lower 24 h DBP SD and nighttime DBP SD, and Hai-Shi SBP CV is independently correlated with the diagnosis of YDSH. The prediction accuracy of the random forest diagnostic model of YDSH was good, which could help clinicians differentiate YDSH from non-Yin deficiency patients for more effective TCM treatment of hypertension.
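The division of the day into 12 two-hour periods can be written as a small lookup, which is how an ABPM reading's timestamp would be binned before computing per-period statistics. The period names and boundaries follow the METHODS text; the function name `shi_period` is illustrative.

```python
# Two-hour periods in the order given in the abstract, starting at 7:00.
PERIODS = [
    "Chen-Shi", "Si-Shi", "Wu-Shi", "Wei-Shi", "Shen-Shi", "You-Shi",
    "Xu-Shi", "Hai-Shi", "Zi-Shi", "Chou-Shi", "Yin-Shi", "Mao-Shi",
]

def shi_period(hour):
    """Return the two-hour period containing the given clock hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # Shift so that 7:00 maps to slot 0, wrapping past midnight.
    return PERIODS[((hour - 7) % 24) // 2]

for hour in (7, 22, 23, 0, 1, 6):
    print(f"{hour:02d}:00 -> {shi_period(hour)}")
```

The modulo handles Zi-Shi, the one period that spans midnight (23:00-1:00).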
As massive underground projects have become common in dense urban areas, a question has arisen: which model best predicts Tunnel Boring Machine (TBM) performance in these tunneling projects? Estimating TBM performance in complex geological conditions remains a great challenge for practitioners and researchers, yet a reliable and accurate prediction is essential for planning a workable tunnel construction schedule. TBM performance is difficult to estimate because of the many geotechnical and geological factors and machine specifications involved. The intelligent techniques previously proposed in this field are mostly based on a single or base model with a low level of accuracy. Hence, this study introduces a hybrid random forest (RF) technique optimized by global harmony search with generalized opposition-based learning (GOGHS) for forecasting TBM advance rate (AR). The main objective of the GOGHS-RF model is to optimize the RF hyper-parameters, e.g., tree number and maximum tree depth. In the modelling, a comprehensive database containing the most influential parameters on TBM performance, together with TBM AR, provided the input and output variables, respectively. To examine the capability of the GOGHS-RF model, three more hybrid models, particle swarm optimization-RF, genetic algorithm-RF, and artificial bee colony-RF, were also constructed to forecast TBM AR. The developed models were evaluated with several performance indices, including the coefficient of determination (R²), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results showed that GOGHS-RF is a more accurate technique for estimating TBM AR than the other applied models. The newly developed GOGHS-RF model achieved R² = 0.9937 and 0.9844 for the train and test stages, respectively, which are higher than those of a pre-developed RF. The importance of the input parameters was also interpreted through the SHapley Additive exPlanations (SHAP) method, which found that thrust force per cutter is the most important variable for TBM AR. The GOGHS-RF model can be used in mechanized tunnel projects for predicting and checking performance.
Automatically detecting Ulva prolifera (U. prolifera) in rainy and cloudy weather using remote sensing imagery has been a long-standing problem. Here, we address this challenge by combining high-resolution Synthetic Aperture Radar (SAR) imagery with machine learning to detect U. prolifera in the South Yellow Sea of China (SYS) in 2021. The findings indicate that the Random Forest model can accurately and robustly detect U. prolifera, even in the presence of complex ocean backgrounds and speckle noise. Visual inspection confirmed that the method identified the majority of pixels containing U. prolifera without misidentifying noise pixels or seawater pixels as U. prolifera. Additionally, the method performed consistently across different images, with an average Area Under Curve (AUC) of 0.930 (±0.028). The analysis yielded an overall accuracy of over 96%, with an average Kappa coefficient of 0.941 (±0.038). Compared to the traditional thresholding method, the Random Forest model has a lower estimation error of 14.81%. Practical application indicates that the method can be used to detect the unprecedented U. prolifera bloom of 2021 and derive its continuous spatiotemporal changes. This study provides a potential new method for detecting U. prolifera and enhances our understanding of macroalgal outbreaks in the marine environment.
Precise and timely prediction of crop yields is crucial for food security and the development of agricultural policies. However, crop yield is influenced by multiple factors within complex growth environments. Previous research has paid relatively little attention to the interference of environmental factors and drought with the growth of winter wheat. More effective methods are therefore needed to explore the relationship between these factors and crop yield. This study used four types of indicators (meteorological, crop growth status, environmental, and drought indices) from October 2003 to June 2019 in Henan Province as the basic data for predicting winter wheat yield. The accuracy of winter wheat yield estimation was calculated using the sparrow search algorithm combined with random forest (SSA-RF) under different input indicators. The estimation accuracy of SSA-RF was compared with that of partial least squares regression (PLSR), extreme gradient boosting (XGBoost), and random forest (RF) models. Finally, the optimal yield estimation method was used to predict winter wheat yield in three typical years. The findings are as follows: 1) SSA-RF outperforms the other algorithms in estimating winter wheat yield. The best yield estimate is achieved by combining all four types of indicators with SSA-RF (R^(2) = 0.805, RRMSE = 9.9%). 2) Crop growth status and environmental indicators play significant roles in wheat yield estimation, accounting for 46% and 22% of the importance among all indicators, respectively. 3) Selecting indicators from October to April of the following year gave the highest accuracy in winter wheat yield estimation, with an R^(2) of 0.826 and an RMSE of 9.0%. Yield estimates can therefore be completed two months before the winter wheat harvest in June. 4) Prediction performance is slightly affected by severe drought. Compared with the severe drought year (2011) (R^(2) = 0.680) and the normal year (2017) (R^(2) = 0.790), the SSA-RF model has higher prediction accuracy for the wet year (2018) (R^(2) = 0.820). This study could provide an innovative approach for remote sensing estimation of winter wheat yield.
Funding: Funds for the Central Universities (grant number CUC24SG018).
Abstract: The proliferation of robot accounts on social media platforms has had a significant negative impact, necessitating robust measures to counter network anomalies and safeguard content integrity. Social robot detection has emerged as a pivotal yet intricate task, aimed at mitigating the dissemination of misleading information. While graph-based approaches have attained remarkable performance in this realm, they grapple with a fundamental limitation: the homogeneity assumption in graph convolution allows social robots to stealthily evade detection by mingling with genuine human profiles. To address this challenge and thwart the camouflage tactics, this work proposes an innovative social robot detection framework based on enhanced HOmogeneity and Random Forest (HORFBot). At the core of HORFBot lies a homogeneous graph enhancement strategy, combined with edge-removal techniques, that dissects the graph into multiple revealing subgraphs. Subsequently, using contrastive learning, the proposed methodology trains multiple graph convolutional networks, each tuned to the nuances of these tailored subgraphs. The final stage fuses these feature-rich base classifiers, aggregating their outputs to produce a comprehensive detection outcome. Extensive experiments on three social robot detection datasets show that this method effectively improves the accuracy of social robot detection and outperforms comparative methods.
Funding: National Natural Science Foundation of China (Grant No. 42274180); National Key Research and Development Program of China (2021YFC2902003).
Abstract: Evaluation of water richness in sandstone is an important research topic in the prevention and control of mine water disasters, and the water richness in sandstone is closely related to its porosity. Reflection seismic exploration data have high-density spatial sampling information, which provides an important data basis for the prediction of sandstone porosity in coal seam roofs. First, the basic principles of the variational mode decomposition (VMD) method and the random forest method are introduced. Then, the geological model of coal seam roof sandstone is constructed, seismic forward modeling is conducted, and random noise is added. The decomposition effects of the empirical mode decomposition (EMD) method and the VMD method on noisy signals are compared and analyzed. The test results show that the first-order intrinsic mode function (IMF1) and IMF2 decomposed by the VMD method contain the main effective components of seismic signals. A prediction process of sandstone porosity in coal seam roofs based on the combination of VMD and the random forest method is proposed. The feasibility and effectiveness of the method are verified by trial calculation in the porosity prediction of model data. Taking actual coalfield reflection seismic data as an example, the sandstone porosity of the 8 coal seam roof is predicted. The application results show the potential application value of the new porosity prediction method proposed in this study. This method has important theoretical guiding significance for evaluating water richness in coal seam roof sandstone and for the prevention and control of mine water disasters.
Abstract: To improve the efficiency of air quality analysis and the accuracy of predictions, this paper proposes a composite method based on Vector Autoregressive (VAR) and Random Forest (RF) models. In the theoretical section, the model introduction and estimation algorithms are provided. In the empirical analysis section, global air quality data from 2022 to 2024 are used, and the proposed method is applied. Specifically, principal component analysis (PCA) is first conducted, and then VAR and Random Forest methods are used for prediction on the reduced-dimensional data. The results show that the RMSE of the hybrid model is 45.27, significantly lower than the 49.11 of the VAR model alone, verifying its superiority. The stability and predictive performance of the model are effectively enhanced.
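The reported gain can be read as a relative reduction in RMSE. A minimal sketch of that arithmetic follows; it uses only the two RMSE values quoted in the abstract, and the function name is illustrative rather than taken from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# RMSE values quoted in the abstract: VAR alone vs. the VAR + RF hybrid.
var_rmse = 49.11
hybrid_rmse = 45.27

reduction_pct = relative_reduction(var_rmse, hybrid_rmse)
print(f"Relative RMSE reduction: {reduction_pct:.2f}%")
```

On these two figures the hybrid model's RMSE is a little under 8% lower than that of the VAR model alone.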
Funding: Funded by Institutional Fund Projects under grant no. (IFPDP-261-22).
Abstract: Detecting cyber attacks in networks connected to the Internet of Things (IoT) is of utmost importance because of the growing vulnerabilities in the smart environment. Conventional models, such as Naive Bayes and support vector machine (SVM), as well as ensemble methods, such as Gradient Boosting and eXtreme gradient boosting (XGBoost), are often plagued by high computational costs, which makes it challenging for them to perform real-time detection. In this regard, we propose an attack detection approach that integrates Visual Geometry Group 16 (VGG16), Artificial Rabbits Optimizer (ARO), and a Random Forest model to increase detection accuracy and operational efficiency in IoT networks. In the proposed model, features are extracted from malware pictures with VGG16. The random forest model then carries out the prediction using the features extracted by VGG16. Additionally, ARO is used to tune the hyper-parameters of the random forest model. With an accuracy of 96.36%, the proposed model outperforms the standard models in terms of accuracy, F1-score, precision, and recall. The comparative research highlights our strategy's success, which improves performance while maintaining a lower computational cost. This method is effective and well suited to real-time applications.
Funding: Guangdong Innovation and Entrepreneurship Training Programme for Undergraduates, "Automatic Classification and Identification of Fraudulent Websites Based on Machine Learning" (Project No.: DC2023125).
Abstract: This paper explores the synergistic effect of a model combining Elastic Net and Random Forest in online fraud detection. The study selects a public network dataset containing 1781 data records, divides the dataset into 70% for training and 30% for validation, and analyses the correlation between features using a correlation matrix. The experimental results show that the Elastic Net feature selection method generally outperforms PCA across the models, especially when combined with the Random Forest and XGBoost models. The Elastic Net + Random Forest model achieves the highest accuracy of 0.968 and an AUC value of 0.983, while the Kappa and MCC reach 0.839 and 0.844 respectively, showing extremely high consistency and correlation. This indicates that combining Elastic Net feature selection with a Random Forest model has significant performance advantages in online fraud detection.
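Kappa and MCC, the two agreement measures quoted alongside accuracy and AUC, can both be computed from a binary confusion matrix. The sketch below uses the standard definitions of Cohen's kappa and the Matthews correlation coefficient; the confusion-matrix counts are hypothetical and are not the paper's data.

```python
import math


def kappa_and_mcc(tp, fp, fn, tn):
    """Return (Cohen's kappa, Matthews correlation coefficient) for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return kappa, mcc


# Hypothetical counts, chosen only to illustrate the calculation.
kappa, mcc = kappa_and_mcc(tp=40, fp=10, fn=5, tn=45)
print(f"kappa={kappa:.3f}, MCC={mcc:.3f}")
```

Both measures correct for agreement that would occur by chance, which is why they can sit well below raw accuracy on imbalanced data.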
Abstract: The Darjeeling Himalayan region, characterized by its complex topography and vulnerability to multiple environmental hazards, faces significant challenges including landslides, earthquakes, flash floods, and soil loss that critically threaten ecosystem stability. Among these challenges, soil erosion emerges as a silent disaster: a gradual yet relentless process whose impacts accumulate over time, progressively degrading landscape integrity and disrupting ecological sustainability. Unlike catastrophic events with immediate visibility, soil erosion's most devastating consequences often manifest decades later through diminished agricultural productivity, habitat fragmentation, and irreversible biodiversity loss. This study developed a scalable predictive framework employing Random Forest (RF) and Gradient Boosting Tree (GBT) machine learning models to assess and map soil erosion susceptibility across the region. A comprehensive geo-database was developed incorporating 11 erosion-triggering factors: slope, elevation, rainfall, drainage density, topographic wetness index, normalized difference vegetation index (NDVI), curvature, soil texture, land use, geology, and aspect. A total of 2,483 historical soil erosion locations were identified and randomly divided into two sets: 70% for model building and 30% for validation. The models revealed distinct spatial patterns of erosion risk, with GBT classifying 60.50% of the area as very low susceptibility, while RF identified 28.92% in this category. Notable differences emerged in high-risk zone identification, with GBT highlighting 7.42% and RF indicating 2.21% as very high erosion susceptibility areas. Both models demonstrated robust predictive capabilities, with GBT achieving 80.77% accuracy and 0.975 AUC, slightly outperforming RF's 79.67% accuracy and 0.972 AUC. Analysis of predictor variables identified elevation, slope, rainfall and NDVI as the primary factors influencing erosion susceptibility, highlighting the complex interrelationship between geo-environmental factors and erosion processes. This research offers a strategic framework for targeted conservation and sustainable land management in the fragile Himalayan region, providing valuable insights to help policymakers implement effective soil erosion mitigation strategies and support long-term environmental sustainability.
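The random 70/30 division of the 2,483 erosion locations can be sketched as a shuffled index split. The function name, the fixed seed, and the rounding-down of the training share are assumptions made here for illustration; the abstract does not say how the authors drew the split.

```python
import random


def split_locations(n_locations, train_fraction=0.7, seed=42):
    """Shuffle location indices and split them into training and validation sets."""
    indices = list(range(n_locations))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = int(n_locations * train_fraction)  # rounds down (an assumption)
    return indices[:n_train], indices[n_train:]


# 2,483 is the number of historical erosion locations given in the abstract.
train_idx, test_idx = split_locations(2483)
print(len(train_idx), len(test_idx))
```

With the training share rounded down, this yields 1,738 locations for model building and 745 for validation.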
Funding: Supported by the National Natural Science Foundation of China [42030109, 42074012]; the Scientific Study Project for Institutes of Higher Learning, Ministry of Education, Liaoning Province [LJKMZ20220673]; the State Key Laboratory of Geodesy and Earth's Dynamics, Innovation Academy for Precision Measurement Science and Technology [SKLGED2023-3-2]; the Liaoning Revitalization Talent Program [XLYC2203162]; and the Natural Science Foundation of Hebei Province in China [D2023402024].
Abstract: Zenith wet delay (ZWD) is a key parameter for the precise positioning of global navigation satellite systems (GNSS) and occupies a central role in meteorological research. Currently, most models only consider the periodic variability of the ZWD, neglecting the effect of nonlinear factors on ZWD estimation. This oversight limits their capability to reflect rapid fluctuations of the ZWD. To capture and predict complicated variations in ZWD more accurately, this paper develops the CRZWD model by combining the GPT3 model and the random forest (RF) algorithm, using 5-year atmospheric profiles from 70 radiosonde (RS) stations across China. Taking data from 25 external test stations as reference, the root mean square (RMS) of the CRZWD model is 29.95 mm. Compared with the GPT3 model and another model using a backpropagation neural network (BPNN), the accuracy is improved by 24.7% and 15.9%, respectively. Notably, over 56% of the test stations exhibit an improvement of more than 20% relative to GPT3-ZWD. Further temporal and spatial characteristic analyses also demonstrate the significant accuracy and stability advantages of the CRZWD model, indicating its potential for GNSS-based applications.
Funding: Supported by a Science and Technology Project managed by the State Grid Jiangsu Electric Power Co., Ltd. (No. J2024163).
Abstract: With the popularization of microgrid construction and the connection of renewable energy sources to the power system, the problem of source and load uncertainty faced by the coordinated operation of multi-microgrids is becoming increasingly prominent, and the accuracy of typical scenario predictions is low. To improve the accuracy of scenario prediction under source and load uncertainty, this paper proposes a typical scenario identification model based on random forests and order parameters. Firstly, a method for order parameter identification and quantification is provided for the coordinated operating mode of multi-microgrids, taking into account source-load uncertainty. Secondly, the dynamic change characteristics of the order parameters of the daily load curve, wind and solar curve, and load curve of typical scenarios are statistically analyzed to identify the key order parameters that have the most significant impact on the uncertainty of the load. Then, the order parameters and seasonal distribution are used as features to train a random forest classification model to achieve efficient scenario prediction. Finally, a simulation on actual data from a provincial distribution network shows that the proposed method can classify typical scenarios with an accuracy rate of 92.7%. Additionally, sensitivity analysis is conducted to assess how changes in uncertainty levels affect the importance of each order parameter, allowing for adaptive uncertainty mitigation strategies.
Funding: Support of the project from the National Key R&D Program of China, Research and Application of Sensing System for Cross-regional Complex Oil & Gas Pipeline Network Safe and Efficiency Operational Status Monitoring (Grant No. 2022YFB3207603).
Abstract: One of the core tasks in analyzing Electrochemical Impedance Spectroscopy (EIS) data is to select an appropriate equivalent circuit model to quantify the parameters of the electrochemical reaction process. However, this process often relies on human experience and judgment, which introduces subjectivity and error. In this paper, an intelligent approach is proposed for matching EIS data to their equivalent circuits based on the Random Forest algorithm. It can automatically select the most suitable equivalent circuit model based on the characteristics and patterns of EIS data. Addressing the typical scenario of metal corrosion, an atmospheric corrosion EIS dataset of low-carbon steel is constructed, which includes five different corrosion scenarios. This dataset was used to validate and evaluate the proposed method. The contributions of this paper can be summarized in three aspects: (1) This paper proposes a method for selecting equivalent circuit models for EIS data based on the Random Forest algorithm. (2) Using authentic EIS data collected from metal atmospheric corrosion, the paper establishes a dataset encompassing five categories of metal corrosion scenarios. (3) The superiority of the proposed method is validated using the established authentic EIS dataset. The experimental results demonstrate that, in terms of equivalent circuit matching, this method surpasses other machine learning algorithms in both precision and robustness. Furthermore, it shows strong applicability in the analysis of EIS data.
Funding: Supported by the National Natural Science Foundation of China (32273037 and 32102636); the Guangdong Major Project of Basic and Applied Basic Research (2020B0301030007); the Laboratory of Lingnan Modern Agriculture Project (NT2021007); the Guangdong Science and Technology Innovation Leading Talent Program (2019TX05N098); the 111 Center (D20008); the double first-class discipline promotion project (2023B10564003); and the Department of Education of Guangdong Province (2019KZDXM004 and 2019KCXTD001).
Abstract: A switch from avian-type α-2,3 to human-type α-2,6 receptors is an essential element for the initiation of a pandemic from an avian influenza virus. Some H9N2 viruses exhibit a preference for binding to human-type α-2,6 receptors. This indicates their potential threat to public health. However, our understanding of the molecular basis for the switch of receptor preference is still limited. In this study, we employed the random forest algorithm to identify potentially key amino acid sites within hemagglutinin (HA) that are associated with the receptor binding ability of H9N2 avian influenza virus (AIV). Subsequently, these sites were further verified by receptor binding assays. A total of 12 substitutions in the HA protein (N158D, N158S, A160N, A160D, A160T, T163I, T163V, V190T, V190A, D193N, D193G, and N231D) were predicted to prefer binding to α-2,6 receptors. Except for the V190T substitution, the other substitutions were shown by receptor binding assays to bind preferentially to α-2,6 receptors. In particular, the A160T substitution caused a significant upregulation of immune-response genes and an increased mortality rate in mice. Our findings provide novel insights into the genetic basis of receptor preference of the H9N2 AIV.
Abstract: The agricultural Internet of Things (IoT) system is a critical component of modern smart agriculture, and its security risk assessment methods have garnered increasing attention from the industry. Current agricultural IoT security risk assessment methods primarily rely on expert judgment, introducing subjective factors that reduce the credibility of the assessment results. To address this issue, this study constructed a dataset for agricultural IoT security risk assessment based on real-world security reports. A PCARF algorithm, built on random forest principles, was proposed, incorporating ensemble learning strategies to enhance prediction accuracy. Compared to the second-best model, the proposed model demonstrated a 2.7% increase in accuracy, a 3.4% improvement in recall rate, a 3.1% rise in Area Under the Curve (AUC), and a 7.9% boost in Matthews Correlation Coefficient (MCC). Extensive comparative experiments showed that the proposed model outperforms others in prediction accuracy and robustness.
Funding: Supported by the Startup Grant (PG18929) awarded to F. Shokoohi.
Abstract: Accurate Electric Load Forecasting (ELF) is crucial for optimizing production capacity, improving operational efficiency, and managing energy resources effectively. Moreover, precise ELF contributes to a smaller environmental footprint by reducing the risks of disruption, downtime, and waste. However, with increasingly complex energy consumption patterns driven by renewable energy integration and changing consumer behaviors, no single approach has emerged as universally effective. In response, this research presents a hybrid modeling framework that combines the strengths of Random Forest (RF) and Autoregressive Integrated Moving Average (ARIMA) models, enhanced with an advanced feature selection method, Minimum Redundancy Maximum Relevancy and Maximum Synergy (MRMRMS), to produce a sparse model. Additionally, the residual patterns are analyzed to enhance forecast accuracy. High-resolution weather data from Weather Underground and historical energy consumption data from PJM for Duke Energy Ohio and Kentucky (DEO&K) are used in this application. This methodology, termed SP-RF-ARIMA, is evaluated against existing approaches; it demonstrates a reduction of more than 40% in mean absolute error and root mean square error compared to the second-best method.
Funding: support in providing the data; the Universiti Teknologi Malaysia supported this work under UTM Flagship CoE/RG-Coe/RG 5.2: Evaluating Surface PGA with Global Ground Motion Site Response Analyses for the highest seismic activity location in Peninsular Malaysia (Q.J130000.5022.10G47); Universiti Teknologi Malaysia-Earthquake Hazard Assessment in Peninsular Malaysia Using Probabilistic Seismic Hazard Analysis (PSHA) Method (Q.J130000.21A2.06E9).
Abstract: The prediction of slope stability is a complex nonlinear problem. This paper proposes a new method based on the random forest (RF) algorithm to study the stability of rocky slopes. Taking Bukit Merah, Perak and Twin Peak (Kuala Lumpur) as the study area, the slope characteristics and geometrical parameters are obtained from a multidisciplinary approach (consisting of geological, geotechnical, and remote sensing analyses). Eighteen factors, including rock strength, rock quality designation (RQD), joint spacing, continuity, openness, roughness, filling, weathering, water seepage, temperature, vegetation index, water index, and orientation, are selected as model input variables, while the factor of safety (FOS) functions as the output. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve is obtained, together with precision and accuracy, and used to analyse the model's predictive ability. With a large training set and predicted parameters, an AUC of 0.95 is achieved. A precision score of 0.88 is obtained, indicating that the model has a low false positive rate and correctly identifies a substantial number of true positives. The findings emphasise the importance of using a variety of terrain characteristics and different approaches to characterise the rock slope.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52174001); the National Natural Science Foundation of China (No. 52004064); the Hainan Province Science and Technology Special Fund, "Research on Real-time Intelligent Sensing Technology for Closed-loop Drilling of Oil and Gas Reservoirs in Deepwater Drilling" (ZDYF2023GXJS012); and the Heilongjiang Provincial Government and Daqing Oilfield's first batch of scientific and technological key projects, "Research on the Construction Technology of Gulong Shale Oil Big Data Analysis System" (DQYT-2022-JS-750).
Abstract: Real-time intelligent lithology identification while drilling is vital to realizing downhole closed-loop drilling. The complex and changeable geological environment encountered in drilling poses many challenges for lithology identification. This paper studies the problems of difficult feature information extraction, low precision of thin-layer identification, and limited applicability of the model in intelligent lithology identification. The author tries to improve the comprehensive performance of the lithology identification model from three aspects: data feature extraction, class balance, and model design. A new real-time intelligent lithology identification model based on a dynamic felling strategy weighted random forest algorithm (DFW-RF) is proposed. According to the feature selection results, gamma ray and 2 MHz phase resistivity are the logging while drilling (LWD) parameters that significantly influence lithology identification. The comprehensive performance of the DFW-RF lithology identification model has been verified in applications to 3 wells in different areas. Compared with the prediction results of five typical lithology identification algorithms, the DFW-RF model has a higher lithology identification accuracy rate and F1 score. This model improves the identification accuracy of thin-layer lithology and is effective and feasible in different geological environments. The DFW-RF model is efficient in the real-time intelligent identification of lithologic information in closed-loop drilling and has wide applicability, making it worth using broadly in logging interpretation.
Funding: Supported by the Basic Science Center Program for Multiphase Evolution in Hyper Gravity of the National Natural Science Foundation of China (No. 51988101); the National Natural Science Foundation of China (No. 52178306); and the Zhejiang Provincial Natural Science Foundation of China (No. LR19E080002).
Abstract: During construction, the shield linings of tunnels often face the problem of local or overall upward movement after leaving the shield tail in soft soil areas or during some large-diameter shield projects. Differential floating increases the initial stress on the segments and bolts, which is harmful to the service performance of the tunnel. In this study we used a random forest (RF) algorithm combined with particle swarm optimization (PSO) and 5-fold cross-validation (5-fold CV) to predict the maximum upward displacement of tunnel linings induced by shield tunnel excavation. The mechanism and factors causing upward movement of the tunnel lining are comprehensively summarized. Twelve input variables were selected according to the analysis of influencing factors. The prediction performance of two models, PSO-RF and RF (default), was compared. The Gini value was obtained to represent the relative importance of the influencing factors to the upward displacement of linings. The PSO-RF model successfully predicted the maximum upward displacement of the tunnel linings with a low error (mean absolute error (MAE) = 4.04 mm, root mean square error (RMSE) = 5.67 mm) and high correlation (R² = 0.915). The thrust and depth of the tunnel were the most important factors in the prediction model influencing the upward displacement of the tunnel linings.
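The 5-fold cross-validation step can be sketched as a partition of sample indices into five validation folds, with the rest used for training each time. This is a generic illustration, not the authors' implementation: the sample count is hypothetical, and the folds here are contiguous and unshuffled, which is an assumption.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


# Hypothetical sample count, used only to show how the folds divide.
folds = list(k_fold_indices(n_samples=23, k=5))
for train, validation in folds:
    print(len(train), len(validation))
```

In a PSO-RF setting, each candidate hyper-parameter set would typically be scored by its average error over the five validation folds, and that average would serve as the fitness value the optimizer minimizes.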
Funding: National Key R&D Program of China Project: Study on Syndrome Differentiation Standard of Yin Deficiency Syndrome in Hypertension (No. 2018YFC1704403); National Key R&D Program of China Project: Systematic Study on the Standard of Syndrome Differentiation of Yin Deficiency Syndrome (No. 2018YFC1704400); the Natural Science Foundation of Jiangsu Province: Exploring the Cardioprotective Effect and Mechanism of Qinggan Zishen Formula on Obesity and Hypertension Based on Nrf2 Regulation of Cardiac Homeostasis (No. BK20221422); the Natural Science Foundation of Jiangsu Province: Mechanism Study on the Promotion of Cardiac Energy Metabolism Balance and Inhibition of DOX Induced Heart Failure through Nr1d1/Nfil3 Mediated Circadian Pathway by Yiqi Wenyang Formula (No. BK20220739).
Abstract: OBJECTIVE: To investigate blood pressure rhythm (BPR) in Yin deficiency syndrome of hypertension (YDSH) patients and develop a random forest model for predicting YDSH. METHODS: Our study was consistent with the technical processes and specification for developing guidelines of evidence-based Chinese medicine clinical practice (T/CACM 1032-2017). We enrolled 234 patients who had been diagnosed with primary hypertension and had taken no antihypertensive medications prior to enrollment. All participants were divided into a Yin deficiency group (YX, n = 74) and a non-Yin deficiency group (NYX, n = 160). Participants were grouped by three experienced chief Traditional Chinese Medicine (TCM) physicians according to the four examinations (i.e., inspection, listening and smelling, inquiry, and palpation). We collected data on 24 h ambulatory blood pressure monitoring (ABPM) and the YDSH rating scale. We divided the 24 h of a day into 12 two-hour periods [Chen-Shi (7:00-9:00), Si-Shi (9:00-11:00), Wu-Shi (11:00-13:00), Wei-Shi (13:00-15:00), Shen-Shi (15:00-17:00), You-Shi (17:00-19:00), Xu-Shi (19:00-21:00), Hai-Shi (21:00-23:00), Zi-Shi (23:00-1:00), Chou-Shi (1:00-3:00), Yin-Shi (3:00-5:00), Mao-Shi (5:00-7:00)] according to the theory of "midnight-midday ebb flow". We used random forest to build the diagnostic model of YDSH, with the presence of Yin deficiency syndrome as the outcome. RESULTS: Compared with the NYX group, the YX group had more female participants and older age, with lower waist circumference, body mass index (BMI), diastolic blood pressure (DBP), and smoking and drinking rates (all P < 0.05). The YDSH rating scores of the YX group [28.5 (21.0-36.0)] were significantly higher than those of the NYX group [13.0 (8.0-22.0)] (P < 0.001), and the typical symptoms of the YX group included vexing heat in the chest, palms and soles, dizziness, dry eyes, string-like and fine pulse, soreness and weakness of lumbus and knees, palpitations, reddened cheeks, and tinnitus (all P < 0.05). The ratio of non-dipper hypertension in the YX group was higher than in the NYX group (56.9% vs 44.4%, P = 0.004). Compared with the NYX group, 24 h DBP standard deviation (SD), nighttime DBP SD, Si-Shi DBP, Si-Shi mean arterial pressure (MAP), Hai-Shi systolic blood pressure (SBP), Hai-Shi DBP, Hai-Shi MAP, Zi-Shi SBP, Zi-Shi DBP, Zi-Shi MAP, Chou-Shi SBP SD, Chou-Shi DBP SD, and Chou-Shi SBP coefficient of variation (CV) were lower in the YX group (all P < 0.05). Binary logistic regression analysis showed that the diagnosis of YDSH was positively correlated with age, heart rate, YDSH rating scores, and four TCM symptoms including vexing heat in the chest, palms and soles, string-like and fine pulse, soreness and weakness of lumbus and knees, and reddened cheeks (all P < 0.05), but was negatively correlated with smoking (P > 0.05). In addition, the diagnosis of YDSH was positively correlated with daytime SBP SD, nighttime SBP SD, nighttime SBP CV, and Hai-Shi SBP CV, but was negatively correlated with 24 h SBP CV, daytime DBP SD, nighttime DBP SD, and Hai-Shi DBP (all P < 0.05). Hai-Shi SBP CV had an independent and positive correlation with the diagnosis of YDSH after adjusting for age, gender, course of hypertension, BMI, waist circumference, SBP, DBP, heart rate, smoking and drinking (P = 0.029). A diagnostic model of YDSH was established and verified based on the random forest. The results showed that the accuracy, specificity and sensitivity were 77.3%, 77.8% and 76.9%, respectively. CONCLUSION: The BPR was significantly attenuated in YDSH patients, including lower 24 h DBP SD and nighttime DBP SD, and Hai-Shi SBP CV is independently correlated with the diagnosis of YDSH. The prediction accuracy of the random forest diagnostic model of YDSH was good, which could help clinicians differentiate YDSH from non-Yin deficiency patients for more effective TCM treatment of hypertension.
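The division of the day into twelve two-hour periods amounts to a lookup from clock hour to period name. The sketch below follows the period boundaries listed in the abstract; the function name and the use of whole clock hours are assumptions for illustration, not the study's own code.

```python
# Period names in order, starting with Chen-Shi at 07:00, as listed in the abstract.
PERIODS = [
    "Chen-Shi", "Si-Shi", "Wu-Shi", "Wei-Shi", "Shen-Shi", "You-Shi",
    "Xu-Shi", "Hai-Shi", "Zi-Shi", "Chou-Shi", "Yin-Shi", "Mao-Shi",
]


def period_for_hour(hour):
    """Map a clock hour (0-23) to its two-hour period name."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0-23")
    # Shift so that 07:00 becomes 0, wrap past midnight, then bucket by two hours.
    return PERIODS[((hour - 7) % 24) // 2]


for h in (7, 22, 23, 0, 6):
    print(h, period_for_hour(h))
```

An ambulatory blood pressure reading stamped with its clock hour can then be grouped by period before per-period means, SDs, and CVs are computed.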
Funding: The National Natural Science Foundation of China (Grant 42177164); the Distinguished Youth Science Foundation of Hunan Province of China (2022JJ10073).
Abstract: As massive underground projects have become popular in dense urban cities, a problem has arisen: which model best predicts Tunnel Boring Machine (TBM) performance in these tunneling projects? The performance level of TBMs in complex geological conditions is still a great challenge for practitioners and researchers. On the other hand, a reliable and accurate prediction of TBM performance is essential to planning an applicable tunnel construction schedule. The performance of a TBM is very difficult to estimate due to various geotechnical and geological factors and machine specifications. The previously proposed intelligent techniques in this field are mostly based on a single or base model with a low level of accuracy. Hence, this study aims to introduce a hybrid random forest (RF) technique optimized by global harmony search with generalized opposition-based learning (GOGHS) for forecasting TBM advance rate (AR). Optimizing the RF hyper-parameters, e.g., tree number and maximum tree depth, is the main objective of using the GOGHS-RF model. In the modelling of this study, a comprehensive database with the most influential parameters on TBM performance, together with TBM AR, were used as input and output variables, respectively. To examine the capability and power of the GOGHS-RF model, three more hybrid models of particle swarm optimization-RF, genetic algorithm-RF and artificial bee colony-RF were also constructed to forecast TBM AR. Evaluation of the developed models was performed by calculating several performance indices, including determination coefficient (R²), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results showed that the GOGHS-RF is a more accurate technique for estimating TBM AR compared to the other applied models. The newly developed GOGHS-RF model achieved R² = 0.9937 and 0.9844 for the train and test stages, respectively, which are higher than those of a pre-developed RF. Also, the importance of the input parameters was interpreted through the SHapley Additive exPlanations (SHAP) method, and it was found that thrust force per cutter is the most important variable for TBM AR. The GOGHS-RF model can be used in mechanized tunnel projects for predicting and checking performance.
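The three performance indices named above can be computed directly from measured and predicted values. The sketch below uses common textbook definitions and assumes that R² means one minus the ratio of residual to total sum of squares, which the abstract does not specify; the advance-rate values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAPE in percent) for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE is undefined when a measured value is zero; inputs here are all non-zero.
    mape = 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return r2, rmse, mape


# Hypothetical advance-rate values, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2, rmse, mape = regression_metrics(measured, predicted)
print(f"R2={r2:.3f}, RMSE={rmse:.3f}, MAPE={mape:.2f}%")
```

Reporting the indices separately for the training and test stages, as the abstract does, shows how much accuracy is lost on unseen data.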
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42071385); National Science and Technology Major Project of High Resolution Earth Observation System (No. 79-Y50-G18-9001-22/23).
Abstract: Automatically detecting Ulva prolifera (U. prolifera) in rainy and cloudy weather using remote sensing imagery has been a long-standing problem. Here, we address this challenge by combining high-resolution Synthetic Aperture Radar (SAR) imagery with machine learning, and detect the U. prolifera of the South Yellow Sea of China (SYS) in 2021. The findings indicate that the Random Forest model can accurately and robustly detect U. prolifera, even in the presence of complex ocean backgrounds and speckle noise. Visual inspection confirmed that the method successfully identified the majority of pixels containing U. prolifera without misidentifying noise pixels or seawater pixels as U. prolifera. Additionally, the method demonstrated consistent performance across different images, with an average Area Under Curve (AUC) of 0.930 (±0.028). The analysis yielded an overall accuracy of over 96%, with an average Kappa coefficient of 0.941 (±0.038). Compared to the traditional thresholding method, the Random Forest model has a lower estimation error of 14.81%. Practical application indicates that this method can be used in the detection of the unprecedented U. prolifera bloom in 2021 to derive continuous spatiotemporal changes. This study provides a potential new method to detect U. prolifera and enhances our understanding of macroalgal outbreaks in the marine environment.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 52079103).
Abstract: Precise and timely prediction of crop yields is crucial for food security and the development of agricultural policies. However, crop yield is influenced by multiple factors within complex growth environments. Previous research has paid relatively little attention to the interference of environmental factors and drought with the growth of winter wheat. Therefore, there is an urgent need for more effective methods to explore the inherent relationship between these factors and crop yield, making precise yield prediction increasingly important. This study used four types of indicators (meteorological, crop growth status, environmental, and drought index) from October 2003 to June 2019 in Henan Province as the basic data for predicting winter wheat yield. Using the sparrow search algorithm combined with random forest (SSA-RF) under different input indicators, the accuracy of winter wheat yield estimation was calculated. The estimation accuracy of SSA-RF was compared with that of partial least squares regression (PLSR), extreme gradient boosting (XGBoost), and random forest (RF) models. Finally, the optimal yield estimation method was used to predict winter wheat yield in three typical years. The findings are as follows: 1) SSA-RF demonstrates superior performance in estimating winter wheat yield compared with the other algorithms. The best yield estimation is achieved by combining all four types of indicators with SSA-RF (R² = 0.805, RRMSE = 9.9%). 2) Crop growth status and environmental indicators play significant roles in wheat yield estimation, accounting for 46% and 22% of the yield importance among all indicators, respectively. 3) Selecting indicators from October to April of the following year yielded the highest accuracy in winter wheat yield estimation, with an R² of 0.826 and an RMSE of 9.0%. Yield estimates can be completed two months before the winter wheat harvest in June. 4) The prediction performance is slightly affected by severe drought. Compared with the severe drought year (2011) (R² = 0.680) and the normal year (2017) (R² = 0.790), the SSA-RF model has higher prediction accuracy for the wet year (2018) (R² = 0.820). This study could provide an innovative approach for remote sensing estimation of winter wheat yield.
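The abstract reports RRMSE without defining it. A minimal sketch follows, assuming the common definition in which the RMSE is divided by the mean of the observed values and expressed in percent; the study may normalize differently, and the yield figures below are hypothetical.

```python
import math

def rrmse(observed, estimated):
    """Relative RMSE in percent: RMSE divided by the mean observed value (assumed definition)."""
    n = len(observed)
    root_mean_sq = math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)
    return 100.0 * root_mean_sq / (sum(observed) / n)

# Hypothetical observed and estimated yields in t/ha (illustrative only).
observed_yield = [5.0, 6.0, 7.0, 6.0]
estimated_yield = [5.5, 5.5, 7.5, 5.5]

print(f"RRMSE = {rrmse(observed_yield, estimated_yield):.2f}%")
```

Normalizing by the mean makes errors comparable across regions or years whose absolute yield levels differ.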