A switch from avian-type α-2,3 to human-type α-2,6 receptors is an essential element for the initiation of a pandemic from an avian influenza virus. Some H9N2 viruses exhibit a preference for binding to human-type α-2,6 receptors, which identifies their potential threat to public health. However, our understanding of the molecular basis for the switch of receptor preference is still limited. In this study, we employed the random forest algorithm to identify potentially key amino acid sites within hemagglutinin (HA) that are associated with the receptor binding ability of H9N2 avian influenza virus (AIV). Subsequently, these sites were further verified by receptor binding assays. A total of 12 substitutions in the HA protein (N158D, N158S, A160N, A160D, A160T, T163I, T163V, V190T, V190A, D193N, D193G, and N231D) were predicted to prefer binding to α-2,6 receptors. Except for the V190T substitution, the other substitutions were demonstrated to display an affinity for preferential binding to α-2,6 receptors by receptor binding assays. Notably, the A160T substitution caused a significant upregulation of immune-response genes and an increased mortality rate in mice. Our findings provide novel insights into the genetic basis of receptor preference of the H9N2 AIV.
Real-time intelligent lithology identification while drilling is vital to realizing downhole closed-loop drilling. The complex and changeable geological environment encountered during drilling poses many challenges for lithology identification. This paper studies three problems in intelligent lithology identification: difficult extraction of feature information, low precision of thin-layer identification, and limited applicability of the model. The study seeks to improve the comprehensive performance of the lithology identification model from three aspects: data feature extraction, class balance, and model design. A new real-time intelligent lithology identification model, the dynamic felling strategy weighted random forest algorithm (DFW-RF), is proposed. According to the feature selection results, gamma ray and 2 MHz phase resistivity are the logging while drilling (LWD) parameters that significantly influence lithology identification. The comprehensive performance of the DFW-RF lithology identification model has been verified in applications to 3 wells in different areas. Compared with five typical lithology identification algorithms, the DFW-RF model achieves a higher lithology identification accuracy and F1 score. The model improves the identification accuracy of thin-layer lithology and is effective and feasible in different geological environments. The DFW-RF model is efficient for real-time intelligent identification of lithologic information in closed-loop drilling and has broad applicability, which makes it worthy of wide use in logging interpretation.
Precise and timely prediction of crop yields is crucial for food security and the development of agricultural policies. However, crop yield is influenced by multiple factors within complex growth environments. Previous research has paid relatively little attention to the interference of environmental factors and drought on the growth of winter wheat. Therefore, there is an urgent need for more effective methods to explore the inherent relationship between these factors and crop yield, making precise yield prediction increasingly important. This study used four types of indicators (meteorological, crop growth status, environmental, and drought index) from October 2003 to June 2019 in Henan Province as the basic data for predicting winter wheat yield. The accuracy of winter wheat yield estimation was calculated using the sparrow search algorithm combined with random forest (SSA-RF) under different input indicators. The estimation accuracy of SSA-RF was compared with partial least squares regression (PLSR), extreme gradient boosting (XGBoost), and random forest (RF) models. Finally, the optimal yield estimation method was used to predict winter wheat yield in three typical years. The findings are as follows: 1) SSA-RF demonstrates superior performance in estimating winter wheat yield compared to the other algorithms. The best yield estimation is achieved by combining all four types of indicators with SSA-RF (R² = 0.805, RRMSE = 9.9%). 2) Crop growth status and environmental indicators play significant roles in wheat yield estimation, accounting for 46% and 22% of the yield importance among all indicators, respectively. 3) Selecting indicators from October to April of the following year yielded the highest accuracy in winter wheat yield estimation, with an R² of 0.826 and an RMSE of 9.0%. Yield estimates can be completed two months before the winter wheat harvest in June. 4) The prediction performance is slightly affected by severe drought. Compared with the severe drought year (2011) (R² = 0.680) and the normal year (2017) (R² = 0.790), the SSA-RF model has higher prediction accuracy for the wet year (2018) (R² = 0.820). This study could provide an innovative approach for remote sensing estimation of winter wheat yield.
This study investigated the impacts of random negative training datasets (NTDs) on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau, northern Shaanxi Province, China. Based on 40 randomly generated NTDs, the study developed models for the geologic hazard susceptibility assessment using the random forest algorithm and evaluated their performances using the area under the receiver operating characteristic curve (AUC). Specifically, the means and standard deviations of the AUC values from all models were utilized to assess the overall spatial correlation between the conditioning factors and the susceptibility assessment, as well as the uncertainty introduced by the NTDs. A risk and return methodology was then employed to quantify and mitigate the uncertainty, with log odds ratios used to characterize the susceptibility assessment levels. The risk and return values were calculated based on the standard deviations and means of the log odds ratios of various locations. After the mean log odds ratios were converted into probability values, the final susceptibility map was plotted, which accounts for the uncertainty induced by random NTDs. The results indicate that the AUC values of the models ranged from 0.810 to 0.963, with an average of 0.852 and a standard deviation of 0.035, indicating encouraging prediction effects and a certain degree of uncertainty. The risk and return analysis reveals that low-risk and high-return areas correspond to lower standard deviations and higher means across multiple model-derived assessments. Overall, this study introduces a new framework for quantifying the uncertainty of multiple training and evaluation models, aimed at improving their robustness and reliability. Additionally, by identifying low-risk and high-return areas, resource allocation for geologic hazard prevention and control can be optimized, thus ensuring that limited resources are directed toward the most effective prevention and control measures.
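The risk and return calculation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the input is assumed to be the susceptibility probabilities that several models (one per NTD) assign to a single location, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import math
from statistics import mean, pstdev


def logit(p):
    """Convert a probability strictly between 0 and 1 into log odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Convert log odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def risk_return(probabilities):
    """Summarize one location across several models.

    Returns (return_value, risk_value, final_probability), where
    return_value is the mean of the log odds, risk_value is their
    standard deviation (population form, an assumption), and
    final_probability is the mean log odds mapped back to [0, 1].
    """
    log_odds = [logit(p) for p in probabilities]
    return_value = mean(log_odds)
    risk_value = pstdev(log_odds)
    return return_value, risk_value, sigmoid(return_value)


# Hypothetical probabilities for one location from three models.
ret, risk, prob = risk_return([0.70, 0.75, 0.80])
```

A location where the models agree gives a small risk value; a location where they disagree gives a large one, even if the mean is the same.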
The prediction of slope stability is a complex nonlinear problem. This paper proposes a new method based on the random forest (RF) algorithm to study the stability of rocky slopes. Taking Bukit Merah, Perak and Twin Peak (Kuala Lumpur) as the study area, the slope characteristics of geometrical parameters are obtained from a multidisciplinary approach (consisting of geological, geotechnical, and remote sensing analyses). Eighteen factors, including rock strength, rock quality designation (RQD), joint spacing, continuity, openness, roughness, filling, weathering, water seepage, temperature, vegetation index, water index, and orientation, are selected as model input variables, while the factor of safety (FOS) functions as the output. The area under the receiver operating characteristic (ROC) curve (AUC) is obtained together with precision and accuracy and used to analyse the predictive ability of the model. With a large training set and predicted parameters, an AUC of 0.95 is achieved. A precision score of 0.88 is obtained, indicating that the model has a low false positive rate and correctly identifies a substantial number of true positives. The findings emphasise the importance of using a variety of terrain characteristics and different approaches to characterise the rock slope.
The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P < 0.001), as well as in all transrectal ultrasound characteristics (P < 0.05) except uneven echo (P = 0.609). The random forest model based on age, prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than any of these diagnostic indicators on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
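The five diagnostic measures reported in this abstract all derive from one 2×2 confusion matrix, and a short sketch may make the relationships concrete. The abstract does not give the underlying counts; the counts below (TP = 235, FN = 123, FP = 36, TN = 547, summing to 941) are a hypothetical set chosen only because they reproduce the reported percentages, and the function name is likewise an illustration.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard diagnostic measures from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # all correct calls
        "sensitivity": tp / (tp + fn),      # cancers correctly detected
        "specificity": tn / (tn + fp),      # non-cancers correctly excluded
        "ppv": tp / (tp + fp),              # positive predictive value
        "npv": tn / (tn + fn),              # negative predictive value
    }


# Hypothetical counts, not taken from the paper (see note above).
metrics = diagnostic_metrics(tp=235, fn=123, fp=36, tn=547)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these counts, the gap between high specificity and moderate sensitivity shows up directly as the much larger number of false negatives than false positives.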
Estimating the volume growth of forest ecosystems accurately is important for understanding carbon sequestration and achieving carbon neutrality goals. However, the key environmental factors affecting volume growth differ across various scales and plant functional types. This study was, therefore, conducted to estimate the volume growth of Larix and Quercus forests and its influencing factors, based on national-scale forestry inventory data in China, using random forest algorithms. The results showed that the model performances for volume growth in natural forests (R² = 0.65 for Larix and 0.66 for Quercus) were better than those in planted forests (R² = 0.44 for Larix and 0.40 for Quercus). In both natural and planted forests, stand age showed a strong relative importance for volume growth (8.6%–66.2%), while the edaphic and climatic variables had a limited relative importance (<6.0%). The relationship between stand age and volume growth was unimodal in natural forests and increased linearly in planted Quercus forests. The specific locations (i.e., altitude and aspect) of sampling plots exhibited high relative importance for volume growth in planted forests (4.1%–18.2%). Altitude positively affected volume growth in planted Larix forests but negatively affected volume growth in planted Quercus forests. Similarly, the effects of other environmental factors on volume growth also differed between stand origins (planted versus natural) and plant functional types (Larix versus Quercus). These results highlight that stand age was the most important predictor of volume growth and that environmental factors had diverse effects on volume growth among stand origins and plant functional types. Our findings will provide a good framework for site-specific recommendations regarding the management practices necessary to maintain the volume growth in China's forest ecosystems.
This paper presents a new framework for object-based classification of high-resolution hyperspectral data. This multi-step framework is based on multi-resolution segmentation (MRS) and Random Forest classifier (RFC) algorithms. The first step is to determine the weights of the input features when using the object-based approach with MRS to process such images. Given the high number of input features, an automatic method is needed for estimating this parameter. We therefore used the Variable Importance (VI), one of the outputs of the RFC, to determine the importance of each image band. Then, based on this parameter and other required parameters, the image is segmented into homogeneous regions. Finally, the RFC is carried out based on the characteristics of the segments to convert them into meaningful objects. The proposed method, as well as the conventional pixel-based RFC and Support Vector Machine (SVM) methods, was applied to three different hyperspectral datasets with various spectral and spatial characteristics. These data were acquired by the HyMap, the Airborne Prism Experiment (APEX), and the Compact Airborne Spectrographic Imager (CASI) hyperspectral sensors. The experimental results show that the proposed method is more consistent for land cover mapping in various areas. The overall classification accuracy (OA) obtained by the proposed method was 95.48, 86.57, and 84.29% for the HyMap, the APEX, and the CASI datasets, respectively. Moreover, this method showed better efficiency in comparison to the spectral-based classifications because the OAs of the proposed method were 5.67 and 3.75% higher than those of the conventional RFC and SVM classifiers, respectively.
The random forest algorithm was applied to study the nuclear binding energy and charge radius. The regularized root-mean-square of error (RMSE) was proposed to avoid overfitting during the training of random forest. RMSE for nuclides with Z, N > 7 is reduced to 0.816 MeV and 0.0200 fm compared with the six-term liquid drop model and a three-term nuclear charge radius formula, respectively. Specific interest is in the possible (sub)shells in the superheavy region, which is important for searching for new elements and the island of stability. The significance of shell features estimated by the so-called Shapley additive explanation method suggests (Z, N) = (92, 142) and (98, 156) as possible subshells indicated by the binding energy. Because the presently observed data are far from the N = 184 shell, which is suggested by mean-field investigations, its shell effect is not predicted based on the present training. The significance analysis of the nuclear charge radius suggests Z = 92 and N = 136 as possible subshells. The effect is verified by the shell-corrected nuclear charge radius model.
Given the challenge of estimating or calculating quantities of waste electrical and electronic equipment (WEEE) in developing countries, this article focuses on predicting the WEEE generated by Cameroonian small and medium enterprises (SMEs) that are engaged in ISO 14001:2015 initiatives and consume electrical and electronic equipment (EEE) to enhance their performance and profitability. The methodology employed an exploratory approach involving the application of general equilibrium theory (GET) to contextualize the study and generate relevant parameters for deploying the random forest regression learning algorithm for predictions. Machine learning was applied to 80% of the samples for training, while simulation was conducted on the remaining 20% of samples based on quantities of EEE utilized over a specific period, utilization rates, repair rates, and average lifespans. The results demonstrate that the model's predicted values are close to the actual quantities of generated WEEE; the model's performance was evaluated using the mean squared error (MSE), yielding satisfactory results. Based on this model, both companies and stakeholders can set realistic objectives for managing companies' WEEE, fostering sustainable socio-environmental practices.
In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression, in particular the Sure Independence Screening and Sparsifying Operator (SISSO) method, is key to extracting material descriptors from large datasets. However, SISSO needs to store the entire expression space, which imposes heavy memory demands and limits its performance on complex problems. To address this issue, we propose an RF-SISSO algorithm that combines Random Forests (RF) with SISSO. In this algorithm, the Random Forests algorithm is used for prescreening, capturing non-linear relationships and improving feature selection, which may enhance the quality of the input data and boost the accuracy and efficiency of regression and classification tasks. In a test on SISSO's verification problem for 299 materials, RF-SISSO demonstrates robust performance and high accuracy. RF-SISSO maintains a testing accuracy above 0.9 across all four training sample sizes and significantly enhances regression efficiency, especially in training subsets with smaller sample sizes. For the training subset with 45 samples, the efficiency of RF-SISSO was 265 times higher than that of the original SISSO. As collecting large datasets is both costly and time-consuming in practical experiments, RF-SISSO may benefit scientific research by efficiently offering high prediction accuracy with limited data.
The return of crop residues to cultivated fields has numerous agronomic and soil quality benefits and, therefore, the areal extent of crop residue cover (CRC) could provide a rapid measure of the sustainability of agricultural production systems in a region. Recognizing the limitations of traditional CRC methods, a new method is proposed for estimating the spatial and temporal distribution of maize residue cover (MRC) in Jilin Province, NE China. The method used random forest (RF) algorithms, 13 tillage indices and 9 textural feature indicators derived from Sentinel-2 data. The tillage indices with the best predictive performance were STI and NDTI (R² of 0.85 and 0.84, respectively). Among the texture features, the best-fitting was Band8AMean-5*5 (R² of 0.56 and 0.54 for the line-transect and photographic methods, respectively). Based on MSE and InNodePurity, the optimal combination for the RF algorithm with the line-transect method was STI, NDTI, NDI7, NDRI5, SRNDI, NDRI6, NDRI7 and Band3Mean-3*3. Likewise, the optimal combination for the RF algorithm with the photographic method was STI, NDTI, NDI7, SRNDI, NDRI6, NDRI5, NDRI9 and Band3Mean-3*3. The regional distribution of MRC in Jilin Province, estimated using the RF prediction model, was higher in the central and southeast sections than in the northwest. That distribution was in line with the spatial heterogeneity of maize yield in the region. These findings showed that the RF algorithm can be used to map regional MRC and, therefore, represents a useful tool for monitoring regional-scale adoption of conservation agricultural practices.
Every second, a large volume of useful data is created on social media about various kinds of online purchases and in other forms of reviews. In particular, review data on purchased products is growing enormously in different database repositories every day. Most of the review data are useful to new customers for their further purchases, as well as to existing companies for viewing customers' feedback about various products. Data Mining and Machine Learning techniques are commonly used to analyse such data in order to visualise and understand the potential use of items purchased online. Customers express their judgment of product quality through their sentiments about the items purchased from different online companies. In this research work, we analyse the sentiments of Headphone review data collected from online repositories. For the analysis of the Headphone review data, Machine Learning techniques such as Support Vector Machines, Naive Bayes, Decision Trees and Random Forest algorithms, together with a Hybrid method, are applied to assess quality via the customers' sentiments. The accuracy and performance of the selected algorithms are also analysed based on three types of sentiment: positive, negative and neutral.
Multiscalar topographic influence on soil distribution has a complex pattern that is related to the overlay of pedological processes which occurred at different times, and these driving forces are correlated with many geomorphologic scales. In this sense, the present study tested the hypothesis that multiscale geomorphometric generalized covariables can improve pedometric modeling. To achieve this goal, this case study applied the Random Forest algorithm to a multiscale geomorphometric database to predict soil surface attributes. The study area is in Phanerozoic sedimentary basins, in the Alter do Chão geological formation, Eastern Amazon, Brazil. The multiscale geomorphometric generalization was applied to general and specific geomorphometric covariables, producing groups for each scale combination. The modeling was run using Random Forest for A-horizon thickness, pH, silt and sand content. For model evaluation, visual analysis of digital maps, metrics of forest structures and the effect of variables on prediction were used. For evaluation of soil textural classifications, the confusion matrix with a Kappa index, and the user's and producer's accuracies were employed. The geomorphometric generalization tends to smooth curvatures and produces identifiable geomorphic representations at sub-watershed and watershed levels. The forest structures and the effect of variables on prediction are in agreement with pedological knowledge. The multiscale geomorphometric generalized covariables improved accuracy metrics of soil surface texture classification, with the Kappa index going from 43% to 62%. Therefore, it can be argued that topography influences soil distribution at combined coarser spatial scales and is able to predict soil particle size contents in the studied watershed. Future development of the multiscale geomorphometric generalization framework could include generalization methods concerning preservation of features and landform classification adaptable at multiple scales.
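The Kappa index and the user's and producer's accuracies mentioned in this abstract are all read off a confusion matrix. The sketch below illustrates the standard Cohen's kappa formula in plain Python. The function names and the example matrix are hypothetical, and the layout (rows as mapped classes, columns as reference classes) is an assumption, since conventions differ between authors.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (list of rows)."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)


def users_and_producers(confusion):
    """Per-class accuracies, assuming rows = mapped, columns = reference.

    Every row and column total must be non-zero.
    """
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    users = [confusion[i][i] / row_totals[i] for i in range(len(confusion))]
    producers = [confusion[i][i] / col_totals[i] for i in range(len(confusion))]
    return users, producers


# Hypothetical two-class example, not data from the study.
example = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(example)
users, producers = users_and_producers(example)
```

Kappa discounts the agreement that the marginal totals alone would produce by chance, which is why it is lower than the raw overall accuracy for the same matrix.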
Spontaneous combustion of coal increases the temperature in adjoining overburden strata of coal seams and poses a challenge when loading blastholes. This condition, known as hot-hole blasting, is dangerous due to the increased possibility of premature explosions in loaded blastholes. Thus, it is crucial to load the blastholes with an appropriate amount of explosives within a short period to avoid premature detonation caused by high blasthole temperatures. Additionally, it will help achieve the desired fragment size. This study sought to ascertain the most influential variables for mean fragment size and their optimum values for blasting in a fiery seam. Data on blast design, rock mass, and fragmentation of 100 blasts in fiery seams of a coal mine were collected and used to develop mean fragmentation prediction models using soft computational techniques. The coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), mean square error (MSE), variance accounted for (VAF) and coefficient of efficiency in percentage (CE) were calculated to validate the results. The results indicate that the random forest algorithm (RFA) outperforms the artificial neural network (ANN), response surface method (RSM), and decision tree (DT). The values of R², RMSE, MAE, MSE, VAF, and CE for RFA are 0.94, 0.034, 0.027, 0.001, 93.58, and 93.01, respectively. Multiple parametric sensitivity analyses (MPSAs) of the input variables showed that the Schmidt hammer rebound number and spacing-to-burden ratio are the most influential variables for the blast fragment size. The analysis was finally used to define the best blast design variables to achieve optimum fragment size from blasting. The optimum factor values for RFA of S/B, ld/B and ls/ld are 1.03, 1.85 and 0.7, respectively.
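Several of the validation measures listed in this abstract follow directly from the observed and predicted values, and a small sketch may help relate them. The function name and the sample numbers are hypothetical, and the definitions used are the common textbook ones (for example, VAF as one minus the ratio of residual variance to observed variance, in percent), which may differ in detail from those in the paper. CE is left out because the abstract does not define it.

```python
import math
from statistics import mean, pvariance


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, MAE, MSE and VAF (%) for paired value lists."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    mse = mean([r * r for r in residuals])
    mae = mean([abs(r) for r in residuals])
    obs_mean = mean(observed)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mse": mse,
        # Variance accounted for, using population variances (assumption).
        "vaf": (1.0 - pvariance(residuals) / pvariance(observed)) * 100.0,
    }


# Hypothetical mean fragment sizes, not data from the study.
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that MSE is simply RMSE squared, which is consistent with the reported pair of 0.034 and 0.001 after rounding.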
Progression of acute respiratory infection (ARI) to pneumonia increases severity and healthcare burden. Limited evidence exists on using machine learning to identify predictors from demographic, clinical, and pathogen detection data. This study aimed to identify pneumonia predictors in ARI patients using machine learning methods. This observational study was conducted in Chongqing, China, from September 2023 to April 2024. Outpatients and inpatients with ARI were recruited weekly. A random forest algorithm was used for predictor selection, followed by a logistic regression-based nomogram to analyze the probability of pneumonia. Among the 1,638 patients with ARI, those with pneumonia had higher rates of influenza A virus (IFV-A) (49.2% vs. 39.6%), influenza B virus (26.3% vs. 18.6%), and respiratory syncytial virus (6.1% vs. 1.9%) infection than those without pneumonia. In the subgroup of 79 patients with comprehensive blood tests, pneumonia was positively associated with hemoglobin (130.00 g/L vs. 124.00 g/L), blood urea nitrogen (5.73 mmol/L vs. 4.85 mmol/L), C-reactive protein (36.10 mg/L vs. 25.25 mg/L), procalcitonin (0.11 μg/L vs. 0.07 μg/L), and D-dimer (0.95 μg/L vs. 0.80 μg/L) levels, whereas pneumonia was inversely associated with neutrophils (4.20×10⁹/L vs. 4.76×10⁹/L), aspartate aminotransferase (22.50 U/L vs. 24.00 U/L), and uric acid (280.90 μmol/L vs. 330.00 μmol/L) levels. Elevated D-dimer levels (adjusted odds ratio [aOR] = 1.002, 95% confidence interval [CI]: 1.001-1.004) and IFV-A infection (aOR = 9.308, 95% CI: 2.433-35.606) were significantly associated with increased pneumonia probability. In future clinical practice, particular attention should be given to ARI patients with elevated D-dimer levels and IFV-A infections.
Lithium plating in lithium-ion batteries (LIBs) is one of the main causes of safety accidents in electric vehicles (EVs). The study of intelligent machine learning-based lithium plating detection and warning algorithms for LIBs is of great importance. Therefore, this paper proposes an intelligent lithium plating detection and early warning method for LIBs based on the random forest model. This method can accurately detect lithium plating during the charging process of LIBs and provide early warning based on the detection results. First, pulse charging experiments of LIBs, including normal and lithium plating charging tests, were completed and validated using in situ characterization methods. Second, the normalized internal resistance from the pulse charging test is used to detect lithium plating in LIBs. Third, a lithium plating feature extraction method is proposed to address the lack of useful lithium plating information for LIBs during the charging process. Finally, the Random Forest machine learning technique is used to classify and predict the lithium plating of LIBs. The model validation results show that the detection accuracy of lithium plating is greater than 97.2%. This is of significance for the study of intelligent lithium plating detection algorithms for LIBs.
The Ms 8.0 Wenchuan earthquake of 2008 dramatically changed the terrain surface and caused long-term increases in the scale and frequency of landslides and debris flows. The changing trend of landslides in the earthquake-affected area over the decade since the earthquake remains largely unknown. In this study, we address this issue using supervised classification methods and multitemporal remote sensing images to study landslide evolution in the worst-affected area (Mianyuan River Basin) over a period of ten years. Satellite images were processed using the maximum likelihood method and random forest algorithm to automatically map landslide occurrence from 2007 to 2018. The principal findings are as follows: (1) when compared with visual image analysis, the random forest algorithm had a good average accuracy rate of 87% for landslide identification; (2) post-event landslide occurrence has generally decreased with time, but heavy monsoonal seasons have caused temporary spikes in activity; and (3) the post-earthquake landslide activity in the Mianyuan River Basin can be divided into a strong activity period (2008 to 2011), a medium activity period (2012 to 2016), and a weak activity period (post 2017). Landslide activity remains above the pre-quake level, with damaging events being rare but continuing to occur. Long-term remote sensing and on-site monitoring are required to understand the evolution of landslide activity after strong earthquakes.
The accurate identification of the oil-paper insulation state of a transformer is crucial for most maintenance strategies. This paper presents a multi-feature comprehensive evaluation model based on combination weighting and an improved technique for order of preference by similarity to ideal solution (TOPSIS) method to perform an objective and scientific evaluation of the transformer oil-paper insulation state. Firstly, because a single feature is of limited use for evaluation, multiple aging features are extracted from the recovery voltage polarization spectrum and the extended Debye equivalent circuit. A standard evaluation index system is then established by using the collected time-domain dielectric spectrum data. Secondly, this study applies the per-unit value concept to unify the dimensions of the index matrix and calculates the objective weights by using the random forest algorithm. Furthermore, a combination weighting model merges the subjective experience of experts with the objective random forest weights to overcome the drawbacks of a single weighting method. Lastly, the enhanced TOPSIS approach is used to determine the insulation quality of an oil-paper transformer. A verification example demonstrates that the evaluation model developed in this study can efficiently and accurately diagnose the insulation status of transformers. Essentially, this study presents a novel approach for the assessment of transformer oil-paper insulation.
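The weighting-plus-TOPSIS pipeline described in this abstract can be sketched as follows. This is a minimal illustration of standard TOPSIS with a simple linear blend of expert and data-driven weights, not the paper's improved variant: the function names, the blending factor `alpha`, and the choice of vector normalisation are all assumptions made for the example.

```python
import math


def combine_weights(subjective, objective, alpha=0.5):
    """Blend expert weights with data-driven weights.

    alpha is a hypothetical mixing factor; the abstract does not state
    how the two weight sets are combined.
    """
    mixed = [alpha * s + (1 - alpha) * o for s, o in zip(subjective, objective)]
    total = sum(mixed)
    return [m / total for m in mixed]


def topsis(matrix, weights, benefit):
    """Score alternatives with standard TOPSIS.

    matrix  -- rows are alternatives, columns are indicators
    weights -- one weight per indicator
    benefit -- True where larger is better, False where smaller is better
    Returns a closeness score in [0, 1] per alternative (1 = ideal).
    """
    n_cols = len(weights)
    # Vector-normalise each column, then apply the indicator weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    # Ideal and anti-ideal value of every indicator.
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical data: three transformers, three aging indicators (smaller is better).
weights = combine_weights([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
scores = topsis([[1, 1, 1], [2, 2, 2], [3, 3, 3]], weights, [False, False, False])
```

In this toy matrix the first transformer has the smallest value on every aging indicator, so it coincides with the ideal solution and receives the highest score.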
Recent years have witnessed the continuous discovery of new thermoelectric materials, with a paradigm shift from trial-and-error efforts to experience-based discovery and first-principles calculation. However, both the experimental and the first-principles routes to determining a new compound are time- and resource-consuming. Here, we demonstrate a machine learning approach to discover new M₂X₃-type thermoelectric materials from composition information alone. Starting from the classic Bi₂Te₃ material, we constructed an M₂X₃-type thermoelectric material library with 720 compounds by isoelectronic substitution, of which only 101 compounds have crystal structure information in the Inorganic Crystal Structure Database (ICSD) and the Materials Project (MP) database. A model based on the random forest (RF) algorithm plus Bayesian optimization was used to explore the underlying principles that determine the crystal structures of the known compounds. The physical properties of the constituent elements (such as atomic mass, electronegativity, and ionic radius) were used to define the features of compounds with the general formula ¹M²M¹X²X³X (¹M + ²M : ¹X + ²X + ³X = 2:3). The primary goal is to find, by machine learning, new thermoelectric materials with the same rhombohedral structure as Bi₂Te₃. The final trained RF model showed a high accuracy of 91% in predicting rhombohedral compounds. Finally, we selected four important features for polynomial fitting against the RF prediction results and used the resulting polynomial function to make further discoveries outside the predefined material library.
Funding: Supported by the National Natural Science Foundation of China (32273037 and 32102636), the Guangdong Major Project of Basic and Applied Basic Research (2020B0301030007), the Laboratory of Lingnan Modern Agriculture Project (NT2021007), the Guangdong Science and Technology Innovation Leading Talent Program (2019TX05N098), the 111 Center (D20008), the double first-class discipline promotion project (2023B10564003), and the Department of Education of Guangdong Province (2019KZDXM004 and 2019KCXTD001).
Abstract: A switch from avian-type α-2,3 to human-type α-2,6 receptors is an essential element for the initiation of a pandemic from an avian influenza virus. Some H9N2 viruses exhibit a preference for binding to human-type α-2,6 receptors. This identifies their potential threat to public health. However, our understanding of the molecular basis for the switch of receptor preference is still limited. In this study, we employed the random forest algorithm to identify the potentially key amino acid sites within hemagglutinin (HA) that are associated with the receptor binding ability of H9N2 avian influenza virus (AIV). Subsequently, these sites were further verified by receptor binding assays. A total of 12 substitutions in the HA protein (N158D, N158S, A160N, A160D, A160T, T163I, T163V, V190T, V190A, D193N, D193G, and N231D) were predicted to prefer binding to α-2,6 receptors. Except for the V190T substitution, all of these substitutions were shown by receptor binding assays to bind preferentially to α-2,6 receptors. In particular, the A160T substitution caused a significant upregulation of immune-response genes and an increased mortality rate in mice. Our findings provide novel insights into the genetic basis of receptor preference of the H9N2 AIV.
Funding: Financially supported by the National Natural Science Foundation of China (No. 52174001), the National Natural Science Foundation of China (No. 52004064), the Hainan Province Science and Technology Special Fund "Research on Real-time Intelligent Sensing Technology for Closed-loop Drilling of Oil and Gas Reservoirs in Deepwater Drilling" (ZDYF2023GXJS012), and the Heilongjiang Provincial Government and Daqing Oilfield's first batch of scientific and technological key projects, "Research on the Construction Technology of Gulong Shale Oil Big Data Analysis System" (DQYT-2022-JS-750).
Abstract: Real-time intelligent lithology identification while drilling is vital to realizing downhole closed-loop drilling. The complex and changeable geological environment encountered during drilling poses many challenges for lithology identification. This paper studies three problems in intelligent lithology identification: difficult feature extraction, low precision of thin-layer identification, and limited applicability of the model. The authors try to improve the comprehensive performance of the lithology identification model in three respects: data feature extraction, class balance, and model design. A new real-time intelligent lithology identification model, the dynamic felling strategy weighted random forest algorithm (DFW-RF), is proposed. According to the feature selection results, gamma ray and 2 MHz phase resistivity are the logging while drilling (LWD) parameters that significantly influence lithology identification. The comprehensive performance of the DFW-RF lithology identification model has been verified in 3 wells in different areas. Compared with five typical lithology identification algorithms, the DFW-RF model has a higher lithology identification accuracy rate and F1 score. This model improves the identification accuracy of thin-layer lithology and is effective and feasible in different geological environments. The DFW-RF model is efficient for the real-time intelligent identification of lithologic information in closed-loop drilling and has greater applicability, which makes it worthy of wide use in logging interpretation.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 52079103).
Abstract: Precise and timely prediction of crop yields is crucial for food security and the development of agricultural policies. However, crop yield is influenced by multiple factors within complex growth environments. Previous research has paid relatively little attention to the interference of environmental factors and drought on the growth of winter wheat. Therefore, there is an urgent need for more effective methods to explore the inherent relationship between these factors and crop yield, making precise yield prediction increasingly important. This study used four types of indicators (meteorological, crop growth status, environmental, and drought index) from October 2003 to June 2019 in Henan Province as the basic data for predicting winter wheat yield. The accuracy of winter wheat yield estimation was calculated using the sparrow search algorithm combined with random forest (SSA-RF) under different input indicators. The estimation accuracy of SSA-RF was compared with partial least squares regression (PLSR), extreme gradient boosting (XGBoost), and random forest (RF) models. Finally, the determined optimal yield estimation method was used to predict winter wheat yield in three typical years. The findings are as follows: 1) SSA-RF demonstrates superior performance in estimating winter wheat yield compared to the other algorithms. The best yield estimation is achieved by combining all four types of indicators with SSA-RF (R² = 0.805, RRMSE = 9.9%). 2) Crop growth status and environmental indicators play significant roles in wheat yield estimation, accounting for 46% and 22% of the importance among all indicators, respectively. 3) Selecting indicators from October to April of the following year yielded the highest accuracy in winter wheat yield estimation, with an R² of 0.826 and an RMSE of 9.0%. Yield estimates can be completed two months before the winter wheat harvest in June. 4) The prediction performance is slightly affected by severe drought. Compared with the severe drought year (2011) (R² = 0.680) and the normal year (2017) (R² = 0.790), the SSA-RF model has higher prediction accuracy for the wet year (2018) (R² = 0.820). This study could provide an innovative approach for remote sensing estimation of winter wheat yield.
Funding: Supported by a project entitled Loess Plateau Region-Watershed-Slope Geological Hazard Multi-Scale Collaborative Intelligent Early Warning System of the National Key R&D Program of China (2022YFC3003404), a project of the Shaanxi Youth Science and Technology Star (2021KJXX-87), and public welfare geological survey projects of the Shaanxi Institute of Geologic Survey (20180301, 201918, 202103, and 202413).
Abstract: This study investigated the impacts of random negative training datasets (NTDs) on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau, northern Shaanxi Province, China. Based on 40 randomly generated NTDs, the study developed models for the geologic hazard susceptibility assessment using the random forest algorithm and evaluated their performances using the area under the receiver operating characteristic curve (AUC). Specifically, the means and standard deviations of the AUC values from all models were utilized to assess the overall spatial correlation between the conditioning factors and the susceptibility assessment, as well as the uncertainty introduced by the NTDs. A risk and return methodology was then employed to quantify and mitigate the uncertainty, with log odds ratios used to characterize the susceptibility assessment levels. The risk and return values were calculated from the standard deviations and means of the log odds ratios at each location. After the mean log odds ratios were converted into probability values, the final susceptibility map was plotted, which accounts for the uncertainty induced by random NTDs. The results indicate that the AUC values of the models ranged from 0.810 to 0.963, with an average of 0.852 and a standard deviation of 0.035, indicating encouraging prediction performance along with a degree of uncertainty. The risk and return analysis reveals that low-risk and high-return areas correspond to lower standard deviations and higher means across multiple model-derived assessments. Overall, this study introduces a new framework for quantifying the uncertainty of multiple training and evaluation models, aimed at improving their robustness and reliability. Additionally, by identifying low-risk and high-return areas, resource allocation for geologic hazard prevention and control can be optimized, thus ensuring that limited resources are directed toward the most effective prevention and control measures.
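The risk and return calculation described in this abstract can be illustrated with a short sketch. It assumes that "return" is the mean and "risk" the standard deviation of the log odds predicted for one location by the individual models. The function names, the clipping constant `eps`, and the use of the population standard deviation are assumptions for the example, not details given in the abstract.

```python
import math
from statistics import mean, pstdev


def log_odds(p, eps=1e-6):
    """Convert a probability to log odds, clipping to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def risk_and_return(probabilities):
    """Summarise the susceptibility predicted at one location by many models.

    probabilities -- one predicted probability per model (e.g. 40 values,
                     one per negative training dataset)
    Returns (return, risk, probability): mean log odds, standard deviation
    of the log odds, and the mean log odds mapped back to a probability.
    """
    z = [log_odds(p) for p in probabilities]
    ret = mean(z)
    risk = pstdev(z)
    prob = 1 / (1 + math.exp(-ret))
    return ret, risk, prob


# Hypothetical predictions from four models at two locations.
stable = risk_and_return([0.90, 0.91, 0.89, 0.90])   # high return, low risk
unstable = risk_and_return([0.99, 0.60, 0.95, 0.70])  # similar return, higher risk
```

Working in log odds rather than raw probabilities keeps the averaged quantity unbounded, so the back-transformed mean always falls strictly between 0 and 1.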
Funding: support in providing the data, and the Universiti Teknologi Malaysia supported this work under UTM Flagship CoE/RG-Coe/RG 5.2: Evaluating Surface PGA with Global Ground Motion Site Response Analyses for the highest seismic activity location in Peninsular Malaysia (Q.J130000.5022.10G47) and Universiti Teknologi Malaysia-Earthquake Hazard Assessment in Peninsular Malaysia Using Probabilistic Seismic Hazard Analysis (PSHA) Method (Q.J130000.21A2.06E9).
Abstract: The prediction of slope stability is a complex nonlinear problem. This paper proposes a new method based on the random forest (RF) algorithm to study the stability of rocky slopes. Taking Bukit Merah, Perak and Twin Peak (Kuala Lumpur) as the study area, the slope characteristics and geometrical parameters are obtained from a multidisciplinary approach (consisting of geological, geotechnical, and remote sensing analyses). Eighteen factors, including rock strength, rock quality designation (RQD), joint spacing, continuity, openness, roughness, filling, weathering, water seepage, temperature, vegetation index, water index, and orientation, are selected as model input variables, while the factor of safety (FOS) functions as the output. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve is obtained, together with precision and accuracy, and used to analyse the predictive ability of the model. With a large training set and predicted parameters, an AUC of 0.95 is achieved. A precision score of 0.88 is obtained, indicating that the model has a low false positive rate and correctly identifies a substantial number of true positives. The findings emphasise the importance of using a variety of terrain characteristics and different approaches to characterise the rock slope.
Abstract: The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P < 0.001), as well as in all transrectal ultrasound characteristics (P < 0.05) except uneven echo (P = 0.609). The random forest model based on age, prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than either diagnostic indicator on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
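The five diagnostic figures quoted in this abstract all derive from a single 2×2 confusion matrix, which the sketch below makes explicit. The abstract does not report the underlying counts; the values used here (TP = 235, FN = 123, FP = 36, TN = 547, totalling 941) are back-calculated so that they reproduce the published percentages, and should be read as an illustrative assumption rather than the study's data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts chosen to be consistent with the reported percentages.
m = diagnostic_metrics(tp=235, fn=123, fp=36, tn=547)
for name, value in m.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these counts the relatively low sensitivity next to the high specificity is easy to read off: the model misses about a third of the cancers (123 of 358) but rarely flags a cancer-free patient (36 of 583).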
Funding: Supported by the Major Program of the National Natural Science Foundation of China (No. 32192434), the Fundamental Research Funds of the Chinese Academy of Forestry (No. CAFYBB2019ZD001), and the National Key Research and Development Program of China (2016YFD060020602).
Abstract: Estimating the volume growth of forest ecosystems accurately is important for understanding carbon sequestration and achieving carbon neutrality goals. However, the key environmental factors affecting volume growth differ across scales and plant functional types. This study was therefore conducted to estimate the volume growth of Larix and Quercus forests, and its influencing factors, from national-scale forestry inventory data in China using random forest algorithms. The results showed that the models of volume growth performed better in natural forests (R² = 0.65 for Larix and 0.66 for Quercus) than in planted forests (R² = 0.44 for Larix and 0.40 for Quercus). In both natural and planted forests, stand age showed a strong relative importance for volume growth (8.6%–66.2%), while the edaphic and climatic variables had a limited relative importance (<6.0%). The relationship between stand age and volume growth was unimodal in natural forests and linearly increasing in planted Quercus forests. The specific locations (i.e., altitude and aspect) of the sampling plots exhibited high relative importance for volume growth in planted forests (4.1%–18.2%). Altitude positively affected volume growth in planted Larix forests but negatively affected it in planted Quercus forests. Similarly, the effects of other environmental factors on volume growth also differed between stand origins (planted versus natural) and plant functional types (Larix versus Quercus). These results highlight that stand age was the most important predictor of volume growth and that the effects of environmental factors on volume growth varied among stand origins and plant functional types. Our findings provide a good framework for site-specific recommendations regarding the management practices necessary to maintain volume growth in China's forest ecosystems.
Abstract: This paper presents a new framework for object-based classification of high-resolution hyperspectral data. This multi-step framework is based on multi-resolution segmentation (MRS) and Random Forest classifier (RFC) algorithms. The first step is to determine the weights of the input features used by the object-based approach when MRS processes such images. Given the high number of input features, an automatic method is needed to estimate this parameter. We therefore used the Variable Importance (VI), one of the outputs of the RFC, to determine the importance of each image band. Then, based on this parameter and the other required parameters, the image is segmented into homogeneous regions. Finally, the RFC is run on the characteristics of the segments to convert them into meaningful objects. The proposed method, as well as the conventional pixel-based RFC and Support Vector Machine (SVM) methods, was applied to three different hyperspectral datasets with various spectral and spatial characteristics. These data were acquired by the HyMap, the Airborne Prism Experiment (APEX), and the Compact Airborne Spectrographic Imager (CASI) hyperspectral sensors. The experimental results show that the proposed method is more consistent for land cover mapping in various areas. The overall classification accuracy (OA) obtained by the proposed method was 95.48%, 86.57%, and 84.29% for the HyMap, the APEX, and the CASI datasets, respectively. Moreover, this method performed better than the spectral-based classifications: its OA was 5.67% and 3.75% higher than that of the conventional RFC and SVM classifiers, respectively.
Funding: Supported by the Basic and Applied Basic Research Project of Guangdong Province (2021B0301030006).
Abstract: The random forest algorithm was applied to study the nuclear binding energy and charge radius. A regularized root-mean-square error (RMSE) was proposed to avoid overfitting during the training of the random forest. The RMSE for nuclides with Z, N > 7 is reduced to 0.816 MeV and 0.0200 fm compared with the six-term liquid drop model and a three-term nuclear charge radius formula, respectively. Of specific interest are the possible (sub)shells in the superheavy region, which are important in the search for new elements and the island of stability. The significance of shell features, estimated by the so-called Shapley additive explanation method, suggests (Z, N) = (92, 142) and (98, 156) as possible subshells indicated by the binding energy. Because the presently observed data are far from the N = 184 shell suggested by mean-field investigations, its shell effect is not predicted by the present training. The significance analysis of the nuclear charge radius suggests Z = 92 and N = 136 as possible subshells. The effect is verified by the shell-corrected nuclear charge radius model.
Abstract: Given the challenge of estimating or calculating quantities of waste electrical and electronic equipment (WEEE) in developing countries, this article focuses on predicting the WEEE generated by Cameroonian small and medium enterprises (SMEs) that are engaged in ISO 14001:2015 initiatives and consume electrical and electronic equipment (EEE) to enhance their performance and profitability. The methodology employed an exploratory approach involving the application of general equilibrium theory (GET) to contextualize the study and generate relevant parameters for deploying the random forest regression learning algorithm for predictions. Machine learning was applied to 80% of the samples for training, while simulation was conducted on the remaining 20% of samples based on quantities of EEE utilized over a specific period, utilization rates, repair rates, and average lifespans. The results demonstrate that the model's predicted values are close to the actual quantities of generated WEEE; the model's performance, evaluated using the mean squared error (MSE), was satisfactory. Based on this model, both companies and stakeholders can set realistic objectives for managing companies' WEEE, fostering sustainable socio-environmental practices.
Funding: Supported by the National Natural Science Foundation of China (Nos. 21933006 and 21773124), the Fundamental Research Funds for the Central Universities of Nankai University (Nos. 63243091 and 63233001), and the Supercomputing Center of Nankai University (NKSC).
Abstract: In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression, in particular the Sure Independence Screening and Sparsifying Operator (SISSO) method, is key to extracting material descriptors from large datasets. However, SISSO needs to store the entire expression space, which imposes heavy memory demands and limits its performance on complex problems. To address this issue, we propose the RF-SISSO algorithm, which combines Random Forests (RF) with SISSO. In this algorithm, Random Forests is used for prescreening, capturing non-linear relationships and improving feature selection, which may enhance the quality of the input data and boost accuracy and efficiency on regression and classification tasks. In a test on SISSO's verification problem for 299 materials, RF-SISSO demonstrates robust performance and high accuracy. RF-SISSO maintains a testing accuracy above 0.9 across all four training sample sizes and significantly enhances regression efficiency, especially in training subsets with smaller sample sizes. For the training subset with 45 samples, the efficiency of RF-SISSO was 265 times higher than that of the original SISSO. As collecting large datasets is both costly and time-consuming in practical experiments, RF-SISSO may benefit scientific research by efficiently offering high prediction accuracy with limited data.
Funding: Jointly supported by the National Key Research and Development Program of China (2021YFD1500103), the Science and Technology Project for Black Soil Granary (XDA28080500), and the National Science & Technology Fundamental Resources Investigation Program of China (2018FY100300).
Abstract: The return of crop residues to cultivated fields has numerous agronomic and soil quality benefits and, therefore, the areal extent of crop residue cover (CRC) could provide a rapid measure of the sustainability of agricultural production systems in a region. Recognizing the limitations of traditional CRC methods, a new method is proposed for estimating the spatial and temporal distribution of maize residue cover (MRC) in Jilin Province, NE China. The method used random forest (RF) algorithms, 13 tillage indices and 9 textural feature indicators derived from Sentinel-2 data. The tillage indices with the best predictive performance were STI and NDTI (R² of 0.85 and 0.84, respectively). Among the texture features, the best-fitting was Band8AMean-5*5 (R² of 0.56 and 0.54 for the line-transect and photographic methods, respectively). Based on MSE and IncNodePurity, the optimal combination of RF inputs for the line-transect method was STI, NDTI, NDI7, NDRI5, SRNDI, NDRI6, NDRI7 and Band3Mean-3*3. Likewise, the optimal combination for the photographic method was STI, NDTI, NDI7, SRNDI, NDRI6, NDRI5, NDRI9 and Band3Mean-3*3. The regional distribution of MRC in Jilin Province, estimated using the RF prediction model, was higher in the central and southeast sections than in the northwest. That distribution was in line with the spatial heterogeneity of maize yield in the region. These findings showed that the RF algorithm can be used to map regional MRC and, therefore, represents a useful tool for monitoring regional-scale adoption of conservation agricultural practices.
Abstract: Every second, a large volume of useful data is created on social media about various kinds of online purchases and in other forms of reviews. In particular, review data on purchased products is growing enormously in different database repositories every day. Most of the review data is useful to new customers for their further purchases, as well as to existing companies that want to see customers' feedback about various products. Data mining and machine learning techniques are commonly used to analyse this kind of data, to visualise it, and to assess the potential use of items purchased online. Customers judge the quality of products through the sentiments they express about items purchased from different online companies. This research work analyses the sentiments in headphone review data collected from online repositories. To analyse the headphone review data, machine learning techniques such as Support Vector Machines, Naive Bayes, Decision Trees and Random Forest algorithms, together with a hybrid method, are applied to assess quality via the customers' sentiments. The accuracy and performance of the chosen algorithms are also analysed for three types of sentiment: positive, negative and neutral.
Abstract: The multiscalar influence of topography on soil distribution has a complex pattern that is related to the overlay of pedological processes that occurred at different times, and these driving forces are correlated with many geomorphologic scales. In this sense, the present study tested the hypothesis that multiscale geomorphometric generalized covariables can improve pedometric modeling. To achieve this goal, this case study applied the Random Forest algorithm to a multiscale geomorphometric database to predict soil surface attributes. The study area is in Phanerozoic sedimentary basins, in the Alter do Chão geological formation, Eastern Amazon, Brazil. The multiscale geomorphometric generalization was applied to general and specific geomorphometric covariables, producing groups for each scale combination. The modeling was run using Random Forest for A-horizon thickness, pH, silt and sand content. For model evaluation, visual analysis of digital maps, metrics of forest structures and the effect of variables on prediction were used. For evaluation of soil textural classifications, the confusion matrix with a Kappa index, and the user's and producer's accuracies were employed. The geomorphometry generalization tends to smooth curvatures and produces identifiable geomorphic representations at sub-watershed and watershed levels. The forest structures and the effect of variables on prediction are in agreement with pedological knowledge. The multiscale geomorphometric generalized covariables improved the accuracy metrics of soil surface texture classification, with the Kappa index going from 43% to 62%. Therefore, it can be argued that topography influences soil distribution at combined coarser spatial scales and is able to predict soil particle size contents in the studied watershed. Future development of the multiscale geomorphometric generalization framework could include generalization methods that preserve features and landform classification adaptable to multiple scales.
Abstract: Spontaneous combustion of coal increases the temperature in the overburden strata adjoining coal seams and poses a challenge when loading blastholes. This condition, known as hot-hole blasting, is dangerous due to the increased possibility of premature explosions in loaded blastholes. Thus, it is crucial to load the blastholes with an appropriate amount of explosives within a short period to avoid premature detonation caused by high blasthole temperatures. Additionally, it will help achieve the desired fragment size. This study tried to ascertain the most influential variables for mean fragment size and their optimum values for blasting in a fiery seam. Data on blast design, rock mass, and fragmentation of 100 blasts in fiery seams of a coal mine were collected and used to develop mean fragmentation prediction models using soft computational techniques. The coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), mean square error (MSE), variance accounted for (VAF) and coefficient of efficiency in percentage (CE) were calculated to validate the results. The results indicate that the random forest algorithm (RFA) outperforms the artificial neural network (ANN), response surface method (RSM), and decision tree (DT). The values of R², RMSE, MAE, MSE, VAF, and CE for RFA are 0.94, 0.034, 0.027, 0.001, 93.58, and 93.01, respectively. Multiple parametric sensitivity analyses (MPSAs) of the input variables showed that the Schmidt hammer rebound number and spacing-to-burden ratio are the most influential variables for the blast fragment size. The analysis was finally used to define the best blast design variables to achieve optimum fragment size from blasting. The optimum factor values for RFA of S/B, ld/B and ls/ld are 1.03, 1.85 and 0.7, respectively.
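Most of the validation metrics listed in this abstract can be computed from paired observed and predicted values, as in the sketch below. It uses the usual textbook definitions, with VAF taken as (1 − var(residuals)/var(observed)) × 100; the abstract does not give its formulas, so these definitions, the function name, and the toy data are assumptions. CE is omitted because its exact definition in the paper is not stated.

```python
import math
from statistics import mean, pvariance


def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE, MSE and VAF (%) for paired observations and predictions."""
    resid = [t - p for t, p in zip(y_true, y_pred)]
    mse = mean([r * r for r in resid])
    mae = mean([abs(r) for r in resid])
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    r2 = 1 - sum(r * r for r in resid) / ss_tot
    # VAF compares the variance of the residuals with that of the observations.
    vaf = (1 - pvariance(resid) / pvariance(y_true)) * 100
    return {"R2": r2, "RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse, "VAF": vaf}


# Hypothetical fragment sizes (m): every prediction is 0.5 too high.
biased = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In the toy case the constant offset lowers R² to 0.8 while VAF stays at 100, because VAF only looks at the spread of the residuals and ignores systematic bias. Separately, the reported RMSE of 0.034 and MSE of 0.001 are consistent with RMSE = √MSE once rounding is taken into account (0.034² ≈ 0.0012).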
Funding: Supported by the Chinese Academy of Medical Sciences Innovation Fund for Medical Sciences (2022-12M-CoV19-004), the Science & Technology Fundamental Resources Investigation Program (2023FY100600), the Special Funds for the Basic Research and Development Program of the Central Non-profit Research Institutes of China (2021-RC330-002), and the China Preventive Medicine Association (CPMA2024CRBFK).
Abstract: Progression of acute respiratory infection (ARI) to pneumonia increases severity and healthcare burden. Limited evidence exists on using machine learning to identify predictors from demographic, clinical, and pathogen detection data. This study aimed to identify pneumonia predictors in ARI patients using machine learning methods. This observational study was conducted in Chongqing, China, from September 2023 to April 2024. Outpatients and inpatients with ARI were recruited weekly. A random forest algorithm was used for predictor selection, followed by a logistic regression-based nomogram to analyze the probability of pneumonia. Among the 1,638 patients with ARI, those with pneumonia had higher rates of influenza A virus (IFV-A) (49.2% vs. 39.6%), influenza B virus (26.3% vs. 18.6%), and respiratory syncytial virus (6.1% vs. 1.9%) infection than those without pneumonia. In the subgroup of 79 patients with comprehensive blood tests, pneumonia was positively associated with hemoglobin (130.00 g/L vs. 124.00 g/L), blood urea nitrogen (5.73 mmol/L vs. 4.85 mmol/L), C-reactive protein (36.10 mg/L vs. 25.25 mg/L), procalcitonin (0.11 μg/L vs. 0.07 μg/L), and D-dimer (0.95 μg/L vs. 0.80 μg/L) levels, whereas pneumonia was inversely associated with neutrophils (4.20×10⁹/L vs. 4.76×10⁹/L), aspartate aminotransferase (22.50 U/L vs. 24.00 U/L), and uric acid (280.90 μmol/L vs. 330.00 μmol/L) levels. Elevated D-dimer levels (adjusted odds ratio [aOR] = 1.002, 95% confidence interval [CI]: 1.001-1.004) and IFV-A infection (aOR = 9.308, 95% CI: 2.433-35.606) were significantly associated with increased pneumonia probability. In future clinical practice, particular attention should be given to ARI patients with elevated D-dimer levels and IFV-A infections.
Funding: Supported by the National Natural Science Foundation of China (NSFC) under Grant No. 52477216, the Natural Science Foundation of Shanghai under Grant No. 23ZR1444600, and in part by the NSFC under Grant No. 52277222.
Abstract: Lithium plating in lithium-ion batteries (LIBs) is one of the main causes of safety accidents in electric vehicles (EVs), so machine learning-based algorithms for detecting lithium plating and warning of it are of great importance. This paper proposes an intelligent lithium plating detection and early warning method for LIBs based on the random forest model. The method can accurately detect lithium plating during the charging process of LIBs and issue an early warning according to the detection results. First, pulse charging experiments on LIBs, including normal and lithium plating charging tests, were completed and validated using in situ characterization methods. Second, the normalized internal resistance from the pulse charging test is used to detect lithium plating in LIBs. Third, a lithium plating feature extraction method is proposed to address the lack of useful lithium plating information for LIBs during the charging process. Finally, the random forest machine learning technique is used to classify and predict lithium plating in LIBs. The model validation results show that the detection accuracy of lithium plating is greater than 97.2%. This is of significance for the study of intelligent lithium plating detection algorithms for LIBs.
Funding: Financially supported by the National Key R&D Program (No. 2018YFC1505402), the Key Research and Development Program of Sichuan Province (No. 2023YFS0435), the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection Independent Research Project (No. SKLGP2014Z004), and the Science and Technology Innovation Fund of Sichuan Earthquake Agency (No. 201901).
Abstract: The Ms 8.0 Wenchuan earthquake of 2008 dramatically changed the terrain surface and caused long-term increases in the scale and frequency of landslides and debris flows. The changing trend of landslides in the earthquake-affected area over the decade since the earthquake remains largely unknown. In this study, we address this issue using supervised classification methods and multitemporal remote sensing images to study landslide evolution in the worst-affected area (Mianyuan River Basin) over a period of ten years. Satellite images were processed using the maximum likelihood method and the random forest algorithm to automatically map landslide occurrence from 2007 to 2018. The principal findings are as follows: (1) when compared with visual image analysis, the random forest algorithm had a good average accuracy rate of 87% for landslide identification; (2) post-event landslide occurrence has generally decreased with time, but heavy monsoonal seasons have caused temporary spikes in activity; and (3) the post-earthquake landslide activity in the Mianyuan River Basin can be divided into a strong activity period (2008 to 2011), a medium activity period (2012 to 2016), and a weak activity period (post 2017). Landslide activity remains above the pre-quake level, with damaging events being rare but continuing to occur. Long-term remote sensing and on-site monitoring are required to understand the evolution of landslide activity after strong earthquakes.
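As a rough illustration of how an average accuracy rate against visual interpretation could be computed, the sketch below scores classifier labels against reference labels for each mapped year and then averages across years. The abstract does not say how the 87% figure was aggregated, so per-year averaging is an assumption, and the labels (1 = landslide, 0 = non-landslide), the years shown, and the function name `overall_accuracy` are all hypothetical.

```python
def overall_accuracy(predicted, reference):
    """Fraction of samples whose predicted label matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one sample is required")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical sample labels per year: (classifier output, visual interpretation).
samples_by_year = {
    2008: ([1, 0, 0, 0, 1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]),
    2018: ([0, 0, 1, 0, 0, 0, 0, 1, 0, 0], [0, 0, 1, 0, 0, 0, 0, 1, 0, 1]),
}

# Accuracy for each year, then the unweighted mean across years.
accuracy_by_year = {
    year: overall_accuracy(pred, ref) for year, (pred, ref) in samples_by_year.items()
}
average_accuracy = sum(accuracy_by_year.values()) / len(accuracy_by_year)

print(accuracy_by_year)
print(f"average accuracy: {average_accuracy:.2f}")
```

Overall accuracy alone can look favourable when landslide pixels are rare, so a per-class measure would normally be reported alongside it.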
Funding: Supported by the Natural Science Foundation of Fujian Province (2021J01109).
Abstract: The accurate identification of the oil-paper insulation state of a transformer is crucial for most maintenance strategies. This paper presents a multi-feature comprehensive evaluation model based on combination weighting and an improved technique for order of preference by similarity to ideal solution (TOPSIS) to evaluate the transformer oil-paper insulation state objectively. First, because a single feature is of limited use for evaluation, multiple aging features are extracted from the recovery voltage polarization spectrum and the extended Debye equivalent circuit. A standard evaluation index system is then established from the collected time-domain dielectric spectrum data. Second, the study applies the per-unit value concept to unify the dimensions of the index matrix and calculates the objective weights using the random forest algorithm. Furthermore, a combination weighting model that draws on both the subjective experience of experts and the random forest algorithm overcomes the drawbacks of a single weighting method. Lastly, the enhanced TOPSIS approach is used to determine the oil-paper insulation quality of a transformer. A verification example demonstrates that the evaluation model developed in this study can efficiently and accurately diagnose the insulation status of transformers. In short, this study presents a novel approach to assessing transformer oil-paper insulation.
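The weighting and ranking steps described above can be sketched with the textbook form of TOPSIS. This is not the improved variant proposed in the paper, whose details the abstract does not give. The linear mixing rule in `combine_weights`, the mixing factor `alpha`, the index values, and the weights are all assumptions chosen for illustration; in the paper the objective weights come from a random forest, which is replaced here by fixed numbers.

```python
import math


def combine_weights(subjective, objective, alpha=0.5):
    """Linearly mix expert and data-driven weights, then renormalise to sum to 1."""
    mixed = [alpha * s + (1.0 - alpha) * o for s, o in zip(subjective, objective)]
    total = sum(mixed)
    return [w / total for w in mixed]


def topsis(matrix, weights, benefit):
    """Return the closeness coefficient (0 = worst, 1 = best) of each alternative.

    matrix  : rows are alternatives, columns are evaluation indices
    weights : one weight per column
    benefit : True where larger is better, False where smaller is better
    """
    n_cols = len(weights)
    # Vector-normalise each column, then apply the column weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    # Ideal and anti-ideal points depend on whether each index is a benefit or a cost.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)   # distance to the ideal point
        d_minus = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Hypothetical per-unit aging indices for three transformers (smaller = healthier).
index_matrix = [
    [0.20, 0.30, 0.25],
    [0.60, 0.55, 0.70],
    [0.90, 0.85, 0.95],
]
weights = combine_weights([0.5, 0.3, 0.2], [0.3, 0.3, 0.4])
closeness = topsis(index_matrix, weights, benefit=[False, False, False])
print([round(c, 3) for c in closeness])
```

A higher closeness coefficient indicates an alternative nearer the ideal point, so in this setup the transformer with the smallest aging indices ranks first.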
Funding: Supported by the National Key Research and Development Program of China (No. 2018YFB0703600) and the Shenzhen Key Projects of Long-Term Support Plan (No. 20200925164021002).
Abstract: Recent years have witnessed the continuous discovery of new thermoelectric materials, with a paradigm shift from trial-and-error efforts to experience-based discovery and first-principles calculation. However, both the experimental and the first-principles routes to determining a new compound are time- and resource-consuming. Here, we demonstrate a machine learning approach to discovering new M₂X₃-type thermoelectric materials from composition information alone. Starting from the classic Bi₂Te₃ material, we constructed an M₂X₃-type thermoelectric material library of 720 compounds by isoelectronic substitution, of which only 101 compounds have crystal structure information in the Inorganic Crystal Structure Database (ICSD) and the Materials Project (MP) database. A model based on the random forest (RF) algorithm plus Bayesian optimization was used to explore the underlying principles that determine the crystal structures of the known compounds. The physical properties of the constituent elements (such as atomic mass, electronegativity, and ionic radius) were used to define the features of compounds with the general formula ¹M²M¹X²X³X, where (¹M + ²M):(¹X + ²X + ³X) = 2:3. The primary goal is to find, by machine learning, new thermoelectric materials with the same rhombohedral structure as Bi₂Te₃. The final trained RF model showed a high accuracy of 91% in predicting rhombohedral compounds. Finally, we selected four important features for polynomial fitting to the prediction results of the RF model and used the resulting polynomial function to make further discoveries outside the predefined material library.