Funding: Project (2024JJ2074) supported by the Natural Science Foundation of Hunan Province, China; Project (22376221) supported by the National Natural Science Foundation of China; Project (2023QNRC001) supported by the Young Elite Scientists Sponsorship Program by CAST, China.
Abstract: Driven by rapid technological advancements and economic growth, mineral extraction and metal refining have increased dramatically, generating huge volumes of tailings and mine waste (TMWs). Investigating the morphological fractions of heavy metals and metalloids (HMMs) in TMWs is key to evaluating their leaching potential into the environment; however, traditional experiments are time-consuming and labor-intensive. In this study, 10 machine learning (ML) algorithms were applied and compared for rapidly predicting the morphological fractions of HMMs in TMWs. A dataset comprising 2376 data points was used, with mineral composition, elemental properties, and total concentration as inputs and the concentration of each morphological fraction as output. After grid search optimization, the extra trees model performed best, achieving a coefficient of determination (R²) of 0.946 and 0.942 on the validation and test sets, respectively. Electronegativity was found to have the greatest impact on the morphological fraction. Model performance was further enhanced by applying an ensemble method to the top three ML models: gradient boosting decision tree, extra trees, and categorical boosting. Overall, the proposed framework can accurately predict the concentrations of different morphological fractions of HMMs in TMWs. This approach can reduce detection time and aid in the safe management and recovery of TMWs.
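To make the evaluation metric and the ensemble step concrete, the following is a minimal sketch of how R² is computed and how predictions from three models could be combined. The abstract does not say which ensemble method was used, so plain averaging is an assumption here, and the toy numbers are invented for illustration only; they are not data from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ensemble(predictions):
    """Average the predictions of several models sample by sample.

    Assumption: simple unweighted averaging; the study's actual
    ensemble method is not described in the abstract.
    """
    return [sum(column) / len(column) for column in zip(*predictions)]


# Hypothetical toy values (not from the paper): three models whose
# individual errors partly cancel when averaged.
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 1.5, 3.5, 3.5]
model_b = [0.5, 2.5, 2.5, 4.5]
model_c = [1.0, 2.0, 3.0, 5.0]

combined = average_ensemble([model_a, model_b, model_c])
for name, pred in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", combined)]:
    print(name, round(r2_score(y_true, pred), 3))
```

In this toy case the averaged prediction scores a higher R² than any single model because the individual errors point in different directions; averaging does not guarantee such a gain in general.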
Funding: supported by the National Natural Science Foundation of China (70871032, 90924021, 70971035), the National High Technology Research and Development Program of China (863 Program) (2008AA042901), and the Anhui Provincial Natural Science Foundation (11040606Q27).
Abstract: This paper considers the uniform parallel machine scheduling problem with unequal release dates and delivery times, with the objective of minimizing the maximum completion time. For this NP-hard problem, a "largest sum of release date, processing time and delivery time first" rule is designed to assign each job to a machine, and a "largest difference between delivery time and release date first" rule is designed to sequence the jobs scheduled on the same machine; together, the two rules form a novel algorithm for the problem. To evaluate the performance of the proposed algorithm, a lower bound for the problem is proposed. The accuracy of the algorithm is tested on instances with problem sizes varying from 200 to 600 jobs. The computational results indicate that the average relative error between the proposed algorithm and the lower bound is only 0.667%, so the solutions obtained are very close to the lower bound.
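The two priority rules can be sketched in a few lines. The abstract does not specify how a machine is chosen for each job or how the lower bound is built, so the sketch below makes several assumptions: jobs are taken in order of largest release + processing + delivery and placed on the machine whose accumulated processing time stays smallest, the objective is read as the latest delivery completion, and `simple_lower_bound` is a basic valid bound rather than the paper's. All names (`Job`, `schedule`, `simple_lower_bound`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    release: float   # earliest start time
    work: float      # processing requirement; time on a machine = work / speed
    delivery: float  # delivery time added after processing completes


def schedule(jobs, speeds):
    """Two-rule heuristic sketch for uniform parallel machines."""
    # Rule 1: consider jobs by largest release + work + delivery first.
    order = sorted(jobs, key=lambda j: j.release + j.work + j.delivery, reverse=True)
    loads = [0.0] * len(speeds)
    assigned = [[] for _ in speeds]
    for job in order:
        # Assumed machine choice: smallest accumulated processing time after adding the job.
        m = min(range(len(speeds)), key=lambda i: loads[i] + job.work / speeds[i])
        loads[m] += job.work / speeds[m]
        assigned[m].append(job)

    # Rule 2: on each machine, sequence by largest delivery - release first.
    objective = 0.0
    sequences = []
    for m, machine_jobs in enumerate(assigned):
        seq = sorted(machine_jobs, key=lambda j: j.delivery - j.release, reverse=True)
        t = 0.0
        for job in seq:
            t = max(t, job.release) + job.work / speeds[m]  # wait for release, then process
            objective = max(objective, t + job.delivery)
        sequences.append(seq)
    return sequences, objective


def simple_lower_bound(jobs, speeds):
    """Each job needs at least release + work / fastest speed + delivery."""
    fastest = max(speeds)
    return max(j.release + j.work / fastest + j.delivery for j in jobs)


def relative_error(objective, bound):
    """Gap between the heuristic value and the lower bound, as a fraction of the bound."""
    return (objective - bound) / bound
```

Calling `schedule(jobs, speeds)` returns the per-machine sequences and the objective value, which can then be compared against `simple_lower_bound(jobs, speeds)` through `relative_error`, mirroring the evaluation described in the abstract.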
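Abstract: Because the area under the ROC curve (AUC) is insensitive to the data distribution, AUC-oriented adversarial training (AdAUC) has recently become one of the effective paradigms in machine learning for defending against adversarial attacks under long-tailed distributions. Most current mainstream methods follow an AUC adversarial training framework based on the square surrogate loss and reformulate the pairwise AUC adversarial loss as an instance-wise stochastic saddle point optimization problem, thereby overcoming the end-to-end computational bottleneck. However, in complex real-world application scenarios, an AUC adversarial training framework designed around the square loss may struggle to meet the needs of diverse downstream tasks. Moreover, as with traditional adversarial training paradigms, AUC-oriented adversarial training improves adversarial robustness but also degrades the model's AUC performance on clean samples, and few effective solutions to this problem currently exist. In view of this, this paper systematically studies how to build a general and efficient AUC adversarial machine learning paradigm. First, a general AUC adversarial training framework based on normalized score perturbation (NSAdAUC) is proposed; under relatively mild conditions, the framework can attack the AUC metric by directly perturbing the model's predicted scores for samples, without depending on a specific AUC surrogate loss. On this basis, the paper further shows that the robust AUC error can be decomposed into the sum of two terms, the standard AUC error and the boundary AUC error, and accordingly designs an AUC adversarial training framework based on ranking-aware adversarial regularization (RARAdAUC) that accounts for both standard AUC and robust AUC performance. To verify the effectiveness of the proposed frameworks, extensive experiments were conducted on 5 long-tailed benchmark datasets. The results show that the proposed NSAdAUC and RARAdAUC frameworks are more robust than existing methods under a variety of adversarial attacks, yielding average improvements of 0.94% and 5.52% in standard AUC and 5.69% and 5.41% in robust AUC, respectively.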
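The decomposition of robust AUC error into a standard term and a boundary term can be illustrated on pairwise scores. The abstract does not give the formal definitions, so the sketch below assumes a simple model: each score may be shifted by at most `eps`, a positive-negative pair is a standard error when it is misranked (or tied) without perturbation, and a boundary error when it is ranked correctly but a worst-case shift can flip it. The function name and this perturbation model are assumptions, not the paper's NSAdAUC or RARAdAUC formulation.

```python
def pairwise_auc_errors(pos_scores, neg_scores, eps):
    """Return (standard_error, boundary_error, robust_error) over all pos/neg pairs.

    Assumed perturbation model: the worst case lowers a positive score by eps
    and raises a negative score by eps, shrinking the pair margin by 2 * eps.
    """
    n_pairs = len(pos_scores) * len(neg_scores)
    standard = 0
    boundary = 0
    for sp in pos_scores:
        for sn in neg_scores:
            margin = sp - sn
            if margin <= 0:
                standard += 1   # misranked (or tied) even without an attack
            elif margin <= 2 * eps:
                boundary += 1   # ranked correctly, but the attack can flip it
    robust = standard + boundary  # the two cases are disjoint, so they add
    return standard / n_pairs, boundary / n_pairs, robust / n_pairs


# Hypothetical integer scores chosen to keep the arithmetic exact.
std_err, bnd_err, rob_err = pairwise_auc_errors([9, 6, 4], [5, 3, 1], eps=1)
print("standard AUC:", 1 - std_err)
print("robust AUC:", 1 - rob_err)
```

Under these assumptions the standard AUC is one minus the standard error and the robust AUC is one minus the robust error, so a regularizer that shrinks the boundary term improves robustness without directly trading against the standard term.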