Funding: Supported by the National Natural Science Foundation of China (32273037 and 32102636), the Guangdong Major Project of Basic and Applied Basic Research (2020B0301030007), the Laboratory of Lingnan Modern Agriculture Project (NT2021007), the Guangdong Science and Technology Innovation Leading Talent Program (2019TX05N098), the 111 Center (D20008), the double first-class discipline promotion project (2023B10564003), and the Department of Education of Guangdong Province (2019KZDXM004 and 2019KCXTD001).
Abstract: A switch from avian-type α-2,3 to human-type α-2,6 receptors is an essential element for the initiation of a pandemic from an avian influenza virus. Some H9N2 viruses exhibit a preference for binding to human-type α-2,6 receptors, which indicates their potential threat to public health. However, our understanding of the molecular basis for the switch of receptor preference is still limited. In this study, we employed the random forest algorithm to identify potentially key amino acid sites within hemagglutinin (HA) that are associated with the receptor binding ability of H9N2 avian influenza virus (AIV). These sites were then verified by receptor binding assays. A total of 12 substitutions in the HA protein (N158D, N158S, A160N, A160D, A160T, T163I, T163V, V190T, V190A, D193N, D193G, and N231D) were predicted to prefer binding to α-2,6 receptors. Receptor binding assays confirmed preferential binding to α-2,6 receptors for all of these substitutions except V190T. Notably, the A160T substitution caused a significant upregulation of immune-response genes and an increased mortality rate in mice. Our findings provide novel insights into the genetic basis of receptor preference of the H9N2 AIV.
Abstract: Prelaunch rolling of maritime rockets threatens the reliability of launch in rough sea conditions. To suppress the prelaunch rolling, this study introduces a data-driven prediction approach designed specifically for maritime rockets. The approach uses a hybrid model that combines random forest (RF) and adaptive boosting (AdaBoost) methods to describe the coupling mechanism of the factors affecting rocket rolling and to suppress the rolling. This combination improves forecast accuracy. Dimensionality-reduced response surfaces are then used to visualize the coupling between rocket rolling and the influencing factors, which reveals the prelaunch rolling mechanism. When the angle between the launch device and the ship's bow is within 80°-100°, the dynamic friction coefficient between adapters and guideways is 0.4, and the dynamic friction coefficient between the rocket and launchpad is within 0-0.15 or 0.5-0.7, the prelaunch rolling of the rocket during one motion cycle of the ship is less than 0.065°, down from the original 0.27°, a reduction of 75.93% that effectively suppresses the prelaunch rolling. This study improves the prelaunch stability of maritime rockets in rough sea conditions and establishes a mapping relationship between the factors affecting rocket rolling and the structure of the sea launch system, guiding the optimization of future sea launch systems.
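The 75.93% figure follows from the two roll amplitudes quoted in the abstract. As a quick check of the arithmetic (since the suppressed roll is stated as "less than 0.065°", the result is a lower bound on the reduction):

```latex
\[
\frac{0.27^\circ - 0.065^\circ}{0.27^\circ}
  = \frac{0.205}{0.27}
  \approx 0.7593
  = 75.93\%
\]
```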
Funding: Under the auspices of the Natural Science Foundation of China (No. 32371875, 32001249).
Abstract: Stand age plays a crucial role in forest biomass estimation and carbon cycle modeling. Assessing the uncertainty of stand age prediction models and identifying the key driving factors in the modeling process have become major challenges in forestry research. In this study, we selected the Shaanxi-Gansu-Ningxia region of Northwest China as the research area and used multi-source datasets from the summer of 2019 to extract information on spectral, textural, climatic, water balance, and stand characteristics. By integrating the Random Forest (RF) model with Monte Carlo (MC) simulation, we constructed six regression models based on different combinations of features and evaluated the uncertainty of each model. Furthermore, we investigated the driving factors influencing stand age modeling by analyzing the effects of different types of features on age inversion. Model performance and accuracy were assessed using the root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²), while the relative root mean square error (rRMSE) was employed to quantify model uncertainty. The scenarios that showed the clearest improvement in accuracy together with an effective reduction in uncertainty were Scenario 3, which added climate and water balance information (RMSE = 25.54 yr, MAE = 18.03 yr, R² = 0.51, rRMSE = 19.17%), and Scenario 5, which added stand characterization information (RMSE = 18.47 yr, MAE = 13.05 yr, R² = 0.74, rRMSE = 16.99%). Scenario 6, incorporating all feature types, achieved the highest accuracy (RMSE = 17.60 yr, MAE = 12.06 yr, R² = 0.77, rRMSE = 14.19%). Elevation, minimum temperature, and diameter at breast height (DBH) emerged as the key drivers of stand-age modeling. The proposed method can be used to identify drivers and to quantify uncertainty in stand-age estimation, providing a useful reference for improving model accuracy and uncertainty assessment.
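The four evaluation measures named in the abstract can be illustrated with a short Python sketch. The abstract does not define rRMSE, so the common definition (RMSE divided by the mean observed value, as a percentage) is assumed here and may differ from the authors' Monte Carlo based calculation. The function name `regression_metrics` and the toy stand ages are hypothetical and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, R^2 and rRMSE (%) for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    rmse = math.sqrt(ss_res / n)
    return {
        "rmse": rmse,
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
        # Assumed definition: RMSE relative to the mean observed value.
        "rrmse": 100.0 * rmse / mean_obs,
    }


# Toy stand ages in years (hypothetical values, for illustration only).
observed_age = [20.0, 40.0, 60.0, 80.0]
predicted_age = [25.0, 35.0, 65.0, 75.0]
toy_metrics = regression_metrics(observed_age, predicted_age)
print(toy_metrics)
```

RMSE and MAE are in the units of the target (years here), while R² and rRMSE are unitless, which is why a relative measure is convenient for comparing scenarios.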
Funding: Supported by the National High Technology Research and Development Program of China (863 Program, No. 2013AA102402).
Abstract: Aluminum (Al) alloy is one of the most widely used non-ferrous structural materials in industry and is of great value to the national economy and industrial manufacturing. Classifying Al alloys rapidly and accurately is therefore an important task. Classification methods based on laser-induced breakdown spectroscopy (LIBS) have been reported in recent years. Although LIBS is an advanced detection technology, it must be combined with a suitable algorithm to achieve rapid and accurate classification. As an important machine learning method, the random forest (RF) algorithm plays a major role in pattern recognition and material classification. This paper introduces a rapid classification method for Al alloys based on LIBS and the RF algorithm. The results show that the best accuracy achieved when classifying Al alloy samples with this method is 98.59%, with an average accuracy of 98.45%. The results also show how the accuracy varies with the number of trees in the RF and with the size of the training sample set. Based on these relationships, researchers can select optimized RF parameters to achieve good results. These results indicate that LIBS combined with the RF algorithm can classify Al alloys rapidly and with high accuracy, which has significant practical value.
Abstract: The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P < 0.001), as well as in all transrectal ultrasound characteristics (P < 0.05) except uneven echo (P = 0.609). The random forest model based on age, prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than any single diagnostic indicator on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
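The five performance figures quoted above all derive from a single 2x2 confusion matrix. The Python sketch below shows those definitions. The abstract does not report the underlying counts, so the values passed in are hypothetical: they were chosen only because they sum to 941 and reproduce the quoted percentages after rounding, and they should not be read as the study's data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard diagnostic measures (in percent) from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),   # true positive rate
        "specificity": 100.0 * tn / (tn + fp),   # true negative rate
        "ppv": 100.0 * tp / (tp + fp),           # positive predictive value
        "npv": 100.0 * tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, NOT reported in the abstract: 235 + 123 + 36 + 547 = 941.
diag = diagnostic_metrics(tp=235, fn=123, fp=36, tn=547)
for name, value in diag.items():
    print(f"{name}: {value:.2f}%")
```

The gap between sensitivity and specificity shows why accuracy alone is not enough to judge a screening model: a high specificity can coexist with a substantial share of missed cancers.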
Funding: Supported by State Grid Liaoning Electric Power Supply Co., Ltd., with financial support from the project "Key Technology and Application Research of the Self-Service Grid Big Data Governance" (No. SGLNXT00YJJS1800110).
Abstract: With the development of the data age, data quality has become a widely recognized concern. Outlier detection, a branch of data mining, is closely related to data quality. The isolation forest (iForest) algorithm is one of the more prominent outlier detection algorithms for numerical data in recent years. As the algorithm keeps generating isolation trees, the differences between trees gradually decrease or even vanish, which wastes memory and reduces the efficiency of outlier detection. In addition, some of the constructed isolation trees cannot detect outliers. In this paper, an improved iForest-based method, GA-iForest, is proposed. This method optimizes the isolation forest by selecting better isolation trees according to their detection accuracy and their difference from other trees, thereby removing duplicate, similar, and poorly performing isolation trees and improving the accuracy and stability of outlier detection. The experimental environment was built on the Ubuntu system and the Spark platform, and the outlier datasets provided by ODDS were used for testing. The performance of the proposed method was evaluated using accuracy, recall rate, ROC curves, AUC, and execution time. Experimental results show that the proposed method not only improves the accuracy and stability of outlier detection, but also reduces the number of isolation trees by 20%-40% compared with the original iForest method.
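For readers unfamiliar with the baseline that GA-iForest builds on, the following is a minimal pure-Python sketch of standard isolation forest scoring: random splits, path length, and the usual anomaly score. It does not implement the tree-selection step described in the abstract. All function names, parameter values, and the synthetic data are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """c(n): expected path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    low = min(p[dim] for p in points)
    high = max(p[dim] for p in points)
    if low == high:  # cannot split identical values
        return {"size": len(points)}
    split = rng.uniform(low, high)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(tree, point, depth=0):
    """Depth at which the point reaches a leaf, plus c(size) for the unsplit leaf."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def build_forest(data, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, sample_size


def anomaly_score(trees, sample_size, point):
    """Score in (0, 1); values closer to 1 indicate points that are easier to isolate."""
    mean_depth = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Synthetic data: a Gaussian cluster around the origin plus one distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

trees, psi = build_forest(data)
score_inlier = anomaly_score(trees, psi, (0.0, 0.0))
score_outlier = anomaly_score(trees, psi, (8.0, 8.0))
print(f"inlier score: {score_inlier:.3f}, outlier score: {score_outlier:.3f}")
```

Because every tree is built from independent random splits, many trees end up contributing similar path lengths. That redundancy is what the abstract's tree-selection step aims to reduce.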