Abstract: Given the challenge of estimating or calculating quantities of waste electrical and electronic equipment (WEEE) in developing countries, this article focuses on predicting the WEEE generated by Cameroonian small and medium enterprises (SMEs) that are engaged in ISO 14001:2015 initiatives and consume electrical and electronic equipment (EEE) to enhance their performance and profitability. The methodology employed an exploratory approach in which general equilibrium theory (GET) was applied to contextualize the study and to generate the parameters needed to deploy the random forest regression learning algorithm for predictions. Machine learning was applied to 80% of the samples for training, while simulation was conducted on the remaining 20% of the samples based on quantities of EEE utilized over a specific period, utilization rates, repair rates, and average lifespans. The results demonstrate that the model’s predicted values are close to the actual quantities of generated WEEE; the model’s performance was evaluated using the mean squared error (MSE) and yielded satisfactory results. Based on this model, both companies and stakeholders can set realistic objectives for managing companies’ WEEE, fostering sustainable socio-environmental practices.
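The workflow described in this abstract (train on 80% of the samples, predict the held-out 20%, score with MSE) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, the formula used to generate the WEEE target is a hypothetical stand-in, and the "forest" is a deliberately tiny ensemble of depth-1 regression trees built with bootstrap sampling and one randomly chosen feature per tree. All function names are assumptions introduced for the example.

```python
import random
from statistics import mean


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def split_80_20(rows, seed=0):
    """Shuffle the rows and return (train, held_out) in an 80/20 ratio."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]


def fit_stump(rows, feature):
    """Depth-1 regression tree on one feature: choose the threshold that
    minimizes the summed squared error of the two leaf means."""
    best = None
    values = sorted({x[feature] for x, _ in rows})
    for t in values[:-1]:  # both sides of every candidate split are non-empty
        left = [y for x, y in rows if x[feature] <= t]
        right = [y for x, y in rows if x[feature] > t]
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # feature is constant in this sample: predict the mean
        m = mean(y for _, y in rows)
        return (feature, float("inf"), m, m)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, n_trees=50, seed=0):
    """Bagged stumps: each tree sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(fit_stump(sample, rng.randrange(n_features)))
    return forest


def predict(forest, x):
    """Average the predictions of all trees."""
    return mean(lm if x[f] <= t else rm for f, t, lm, rm in forest)


# Synthetic samples: (EEE quantity, utilization rate, repair rate, lifespan) -> WEEE.
# The generating formula below is hypothetical and only serves the illustration.
rng = random.Random(42)
rows = []
for _ in range(100):
    qty = rng.randint(10, 200)
    use = rng.uniform(0.3, 1.0)
    repair = rng.uniform(0.0, 0.5)
    life = rng.uniform(3.0, 8.0)
    rows.append(((qty, use, repair, life), qty * use * (1 - repair) / life))

train, held_out = split_80_20(rows)
forest = fit_forest(train)
test_mse = mse([y for _, y in held_out], [predict(forest, x) for x, _ in held_out])
print(f"held-out MSE: {test_mse:.3f}")
```

A production model would normally use deeper trees and a tested library implementation; the point here is only the split, fit, predict, and score sequence.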
Funding: the Chinese Clinical Trial Registry (No. ChiCTR2000040109), approved by the Hospital Ethics Committee (No. 20210130017).
Abstract: BACKGROUND Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies. AIM To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency. METHODS This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using the visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated based on discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms. RESULTS A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5–91.9 cm, OR = 1.895, 95% CI: 1.065-3.350; AC ≥ 92 cm, OR = 1.271, 95% CI: 0.730-2.188), and anxiety (OR = 1.071, 95% CI: 1.044-1.100) as predictive factors for DCI, validated by the LASSO and RF methods. For the logistic regression, LASSO, and RF models respectively, training/validation sensitivities were 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities were 0.602/0.511, 0.510/0.562, and 0.977/0.526; and the corresponding areas under the receiver operating characteristic curve (AUCs) were 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820). DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming the other methods in clinical applicability. CONCLUSION The RF-based model exhibited superior predictive accuracy for DCI compared with the multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
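The discrimination metrics reported in this abstract (sensitivity, specificity, AUC) can be computed from binary labels and predicted probabilities as in the sketch below. The labels and scores are made-up illustrative values, not study data, and the 0.5 decision threshold is an assumption; the abstract does not state which cut-off was used.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = difficult insertion, 0 = not difficult.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)
area = auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={area:.3f}")
```

Comparing these values between the training and validation sets, as the abstract does, is what exposes overfitting: a model can reach an AUC of 1.000 on its training data and still score much lower on held-out patients.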
Funding: The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, for funding this research work through Project Number IF2-PSAU-2022/01/22043.
Abstract: Autism spectrum disorder (ASD), classified as a developmental disability, is now more common in children than ever. The drastic increase in the rate of ASD in children worldwide demands early detection. Parents can seek professional help for a better prognosis of the child’s therapy when ASD is diagnosed before the age of five. This research study aims to develop an automated tool for diagnosing autism in children. The computer-aided diagnosis tool for ASD detection is designed and developed with a novel methodology that includes data acquisition, feature selection, and classification phases. The most deterministic features are selected from the self-acquired dataset by novel feature selection methods before classification. The Imperialistic competitive algorithm (ICA), which is based on empires conquering colonies, performs feature selection in this study. The performance of the Logistic Regression (LR), Decision Tree, K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers is studied experimentally. The experimental results show that the Logistic Regression classifier exhibits the highest accuracy on the self-acquired dataset. ASD detection is also evaluated experimentally with the Least Absolute Shrinkage and Selection Operator (LASSO) feature selection method and different classifiers. The Exploratory Data Analysis (EDA) phase uncovered crucial facts about the data, such as the correlation of the features in the dataset with the class variable.
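Abstract: Objective To construct and validate a prediction model for delayed recovery from general anesthesia based on logistic regression and the random forest algorithm. Methods A total of 1177 patients who received general anesthesia and were admitted to the recovery room of a tertiary grade-A hospital in Zhejiang between 2021 and 2023 were selected as study subjects and randomly divided into a training group and a validation group at a ratio of 7:3. Univariate and multivariate logistic regression analyses were used to construct the prediction model for delayed recovery from general anesthesia, which was presented as a nomogram. The random forest algorithm was used to screen the factors influencing delayed recovery in general anesthesia patients and to rank them by importance. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was used to test the predictive effect of the model, and calibration curves and decision curves were used to evaluate its predictive performance comprehensively. Results Delayed recovery occurred in 99 of the 1177 patients, an incidence of 8.41%. Logistic regression showed that sex, ASA classification, age, operation time, type of surgery, and infusion volume were independent risk factors for delayed recovery in general anesthesia patients. The random forest algorithm ranked the variables by importance as follows: type of surgery, age, operation time, infusion volume, ASA classification, and sex. The AUC of the logistic regression model was 0.87 (95% CI 0.83-0.91) in the training group and 0.86 (95% CI 0.81-0.91) in the validation group. The AUC of the random forest model was 0.85 (95% CI 0.49-1.00) in the training group and 0.76 (95% CI 0.26-1.00) in the validation group. These results suggest that the model has good discrimination, relatively high predictive ability, and a certain clinical value. Conclusion Type of surgery, age, operation time, infusion volume, ASA classification, and sex are independent risk factors for delayed recovery in general anesthesia patients. The prediction model built on these factors has high discrimination and calibration, helps predict the occurrence of delayed emergence in general anesthesia patients, and can serve as a reference for formulating and implementing clinical nursing interventions.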
Funding: Supported by the National Natural Science Foundation of China (Nos. 21933006 and 21773124), the Fundamental Research Funds for the Central Universities of Nankai University (Nos. 63243091 and 63233001), and the Supercomputing Center of Nankai University (NKSC).
Abstract: In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression, in particular the Sure Independence Screening and Sparsifying Operator (SISSO) method, is key to extracting material descriptors from large datasets. However, SISSO needs to store the entire expression space, which imposes heavy memory demands and limits its performance on complex problems. To address this issue, we propose the RF-SISSO algorithm, which combines Random Forests (RF) with SISSO. In this algorithm, the Random Forests algorithm is used for prescreening, capturing non-linear relationships and improving feature selection, which may enhance the quality of the input data and boost accuracy and efficiency on regression and classification tasks. In a test on SISSO’s verification problem for 299 materials, RF-SISSO demonstrates robust performance and high accuracy. RF-SISSO maintains a testing accuracy above 0.9 across all four training sample sizes and significantly enhances regression efficiency, especially for training subsets with smaller sample sizes. For the training subset with 45 samples, the efficiency of RF-SISSO was 265 times higher than that of the original SISSO. Because collecting large datasets is both costly and time-consuming in practical experiments, RF-SISSO may benefit scientific research by efficiently offering high prediction accuracy with limited data.
Abstract: CO₂ flooding for enhanced oil recovery (EOR) not only enables underground carbon storage but also plays a critical role in tertiary oil recovery. However, its displacement efficiency is constrained by whether CO₂ and crude oil achieve miscibility, necessitating precise prediction of the minimum miscibility pressure (MMP) for CO₂-oil systems. Traditional methods, such as experimental measurements and empirical correlations, face challenges including time-consuming procedures and limited applicability. In contrast, artificial intelligence (AI) algorithms have emerged as superior alternatives due to their efficiency, broad applicability, and high prediction accuracy. This study employs four AI algorithms, namely Random Forest Regression (RFR), Genetic Algorithm Based Back Propagation Artificial Neural Network (GA-BPNN), Support Vector Regression (SVR), and Gaussian Process Regression (GPR), to establish predictive models for CO₂-oil MMP. A database comprising 151 data entries was used for model development. The performance of these models was evaluated using five distinct statistical metrics and visual comparisons. Validation results confirm their accuracy. Field applications demonstrate that all four models are effective for predicting MMP in ultra-deep reservoirs (burial depth > 5000 m) with complex crude oil compositions. Among them, the RFR and GA-BPNN models outperform SVR and GPR, achieving root mean square errors (RMSE) of 0.33% and 2.23%, and average absolute percentage relative errors (AAPRE) of 0.01% and 0.04%, respectively. Sensitivity analysis of MMP-influencing factors reveals that reservoir temperature (T_R) exerts the most significant impact on MMP, while Xint (mole fraction of intermediate oil components, including C₂-C₄, CO₂, and H₂S) exhibits the least influence.
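Two of the error metrics named in this abstract, RMSE and AAPRE, can be sketched as below. The abstract does not give their formulas, so the definitions here are the commonly used ones and should be read as assumptions: RMSE is the square root of the mean squared difference, and AAPRE is the mean of |predicted - measured| / |measured|, expressed in percent. The MMP values are invented for illustration only.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root mean square error, in the same unit as the inputs."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def aapre(measured, predicted):
    """Average absolute percentage relative error, in percent.
    Assumes no measured value is zero."""
    n = len(measured)
    return 100.0 * sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted)) / n


# Hypothetical MMP values in MPa (not taken from the study's database).
measured_mmp = [20.0, 25.0, 40.0]
predicted_mmp = [21.0, 24.0, 40.0]

print(f"RMSE  = {rmse(measured_mmp, predicted_mmp):.4f} MPa")
print(f"AAPRE = {aapre(measured_mmp, predicted_mmp):.2f} %")
```

RMSE penalizes large individual misses more heavily, while AAPRE scales each miss by the measured value, so reporting both gives a view of absolute and relative accuracy.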