The advent of Big Data has made Machine Learning tasks more intricate, as they frequently involve higher-dimensional data. Feature Selection (FS) methods can reduce the complexity of the data and improve the accuracy, generalizability, and interpretability of models. Meta-heuristic algorithms are often used for FS tasks because of their low requirements and efficient performance. This paper introduces an augmented Forensic-Based Investigation algorithm (DCFBI) that incorporates a Dynamic Individual Selection (DIS) mechanism and a crisscross (CC) mechanism to improve the pursuit phase of the FBI. Moreover, a binary version of DCFBI (BDCFBI) is applied to FS. Experiments on IEEE CEC 2017 against other metaheuristics demonstrate that DCFBI surpasses them in search capability. The influence of the different mechanisms on the original FBI is analyzed on benchmark functions, while the scalability of DCFBI is verified by comparing it with the original FBI on benchmarks of varied dimensions. BDCFBI is then applied to 18 real datasets from the UCI machine learning database and the Wieslaw dataset to select near-optimal features, and the results are compared with those of six well-known binary metaheuristics. The results show that BDCFBI can be more competitive than similar methods and can obtain a feature subset with superior classification accuracy.
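The abstract does not spell out how BDCFBI turns continuous positions into feature subsets or how a subset is scored. As a rough illustration of how binary metaheuristics are commonly wired up for FS, the sketch below uses an S-shaped (sigmoid) transfer function and a weighted error/size fitness. Both choices, the function names, and the weight `alpha` are assumptions for illustration, not the paper's stated method.

```python
import math
import random


def sigmoid_transfer(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    Assumed S-shaped rule: feature i is kept with probability sigmoid(x_i).
    """
    mask = []
    for x in position:
        x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp()
        prob = 1.0 / (1.0 + math.exp(-x))
        mask.append(1 if rng.random() < prob else 0)
    return mask


def fs_fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. alpha is a hypothetical trade-off weight.
    """
    selected = sum(mask)
    if selected == 0:
        return float("inf")  # an empty subset cannot train a classifier
    return alpha * error_rate + (1.0 - alpha) * selected / len(mask)
```

In a wrapper setup, `error_rate` would come from training a classifier on the columns where the mask is 1; the optimizer then searches for the mask with the lowest fitness.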
The geostatistical technique of Kriging has been used extensively for the investigation and delineation of soil heavy metal pollution. Kriging is rarely used in practical circumstances, however, because the parameter values are difficult to determine and relatively optimal locations for further sampling are difficult to find. In this study, we used large numbers of assumed actual polluted fields (AAPFs), randomly generated by unconditional simulation (US), to assess the adjusted total fee (ATF), an assessment standard developed for balancing the correct treatment rate (CTR) and total fee (TF), based on a traditional strategy of systematic (or uniform) grid sampling (SGS) and Kriging. We found that a strategy using both SGS and Kriging was more cost-effective than a strategy using only SGS. Next, we used a genetic algorithm (GA) approach to find optimal locations for the additional sampling. We found that the optimized locations for the additional sampling were at the junctions between polluted and unpolluted areas, where many SGS values lay near the threshold value. This strategy was less helpful, however, when the pollution in the polluted fields showed no spatial correlation.
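For readers unfamiliar with the interpolation step, a minimal sketch of ordinary Kriging at a single target point is given below. The exponential variogram and its parameters (`sill`, `corr_range`, `nugget`) are assumptions chosen for illustration; the abstract does not state which variogram model or parameter values the study used.

```python
import numpy as np


def exponential_variogram(h, sill=1.0, corr_range=10.0, nugget=0.0):
    """Assumed exponential variogram; gamma(0) is defined as 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0.0, nugget + sill * (1.0 - np.exp(-h / corr_range)), 0.0)


def ordinary_kriging(coords, values, target, variogram=exponential_variogram):
    """Return (estimate, kriging variance) at `target` from sampled points.

    coords: (n, 2) sample locations, values: (n,) measurements,
    target: (2,) location to predict.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between samples, and sample-to-target distances.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - target, axis=1)

    # Kriging system with a Lagrange multiplier enforcing sum(weights) == 1.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(d0)

    sol = np.linalg.solve(a, b)
    weights = sol[:n]
    estimate = float(weights @ values)
    variance = float(sol @ b)  # sum_i w_i * gamma(x_i, x_0) + multiplier
    return estimate, variance
```

Under this sketch, a grid cell would be flagged for treatment when its estimate exceeds the pollution threshold; cells whose estimates sit close to the threshold are the ones where extra samples are most likely to change the decision.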
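To accurately assess how river water quality evolves in small plain river network basins with limited monitoring conditions, and to anticipate changes in water quality, a surface water quality prediction model was built from water quality monitoring data collected on the Nanguan River in Taizhou, Zhejiang Province, from June 2021 to June 2023, based on the Bayesian optimization algorithm (BOA) and the bi-directional long short-term memory (BiLSTM) neural network. Box plots and the Spearman rank correlation coefficient were used to explore the spatiotemporal distribution of water quality; four stations in the middle reach were designated as the key study area, and NH3-N and TP as the priorities for control. The LSTM hyperparameters and model structure were optimized through BOA and a bi-directional information transfer mechanism. The results show that, with the BOA-BiLSTM model, the root mean squared errors (RMSE) of the predicted NH3-N concentration over the next 4 h were 0.2132, 0.3689, 0.3327, and 0.3740, respectively, and the RMSEs of the predicted TP concentration over the next 4 h were 0.0246, 0.0321, 0.0422, and 0.0334. Compared with the baseline LSTM model, these represent improvements of 15.8%, 10.6%, 10.6%, and 17.1% for NH3-N and 22.6%, 3.6%, 14.8%, and 11.8% for TP. Taking the NH3-N concentration at Moshiqiao (磨石桥) as an example, time-series prediction was compared with multivariate prediction that added upstream and downstream data; time-series prediction proved more applicable and more accurate for plain river networks with few monitored parameters. Combined with field surveys and land parcel classification in the study area, the study identifies domestic sources, incomplete sewage collection and treatment facilities, and combined rainwater and sewage flow as the priorities for remediation. When monitoring parameters are limited, the proposed model helps improve oversight of water quality anomalies and provides data support for environmental law enforcement and water environment management.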
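The two metrics quoted in the abstract can be sketched as follows. The abstract does not define how the percentage improvement was computed; the relative reduction in RMSE against the baseline used here is an assumed, commonly used definition, and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, model_rmse):
    """Percentage reduction in RMSE relative to a baseline (assumed definition)."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0


# Toy numbers, not data from the study.
observed = [0.50, 0.62, 0.58, 0.71]
predicted = [0.48, 0.65, 0.55, 0.70]
print(rmse(observed, predicted))
print(relative_improvement(0.40, 0.30))
```

Because RMSE carries the units of the predicted variable, values for NH3-N and TP are not directly comparable with each other; the relative improvement is the figure that can be compared across the two.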
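To address the problem that traditional green construction site evaluation methods ignore subjective perception and are hard to present intuitively, this study takes safety, aesthetics, and efficiency as the evaluation dimensions for visual subjective evaluation of green construction sites, and builds a scoring website based on the TrueSkill algorithm to explore how visual subjective evaluation can be applied. In the method, 106 respondents evaluated 1500 images collected at two real construction sites along the three dimensions above, and the TrueSkill algorithm was used for matching and scoring. Each respondent completed 16 sets of evaluation tests on average, yielding 4086 evaluation records in total. Analysis of the skill value scatter plots, the skill value frequency distribution curves, and the changes in image scores confirmed that the TrueSkill-based visual subjective evaluation method is sufficiently credible for green construction site evaluation. In addition, the region of interest (ROI) data collected during the study can serve as a data foundation for research on applying computer vision to green construction site evaluation, and is a useful reference for research on real-time supervision of green construction sites with computer vision.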
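To make the scoring step concrete, the sketch below shows a standard two-player TrueSkill update with no draws, as it might apply when a respondent picks one of two images as better on a given dimension. The pairwise-comparison setup, the default parameters (mean 25, standard deviation 25/3), and the function names are assumptions; the abstract does not describe the exact matching or update configuration used on the website.

```python
import math

MU0 = 25.0            # assumed prior mean for every image
SIGMA0 = MU0 / 3.0    # assumed prior standard deviation
BETA = SIGMA0 / 2.0   # performance noise
TAU = SIGMA0 / 100.0  # dynamics noise added before each update


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_ratings(winner, loser, beta=BETA, tau=TAU):
    """Update (mu, sigma) for the preferred image and the other one.

    Two-player, no-draw case. Extreme upsets where _cdf underflows
    to zero are not handled in this sketch.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    var_w = sigma_w ** 2 + tau ** 2
    var_l = sigma_l ** 2 + tau ** 2

    c = math.sqrt(2.0 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)  # mean correction factor
    w = v * (v + t)        # variance correction factor, in (0, 1)

    new_winner = (mu_w + var_w / c * v,
                  math.sqrt(var_w * (1.0 - var_w / c ** 2 * w)))
    new_loser = (mu_l - var_l / c * v,
                 math.sqrt(var_l * (1.0 - var_l / c ** 2 * w)))
    return new_winner, new_loser
```

Each comparison moves the two means apart and shrinks both uncertainties, so images that are compared more often end up with more stable scores; an unexpected result (a low-rated image preferred over a high-rated one) produces a larger shift than an expected one.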
Funding: supported by the Special Fund of Fundamental Scientific Research Business Expense for Higher School of Central Government (ZY20180119), the Natural Science Foundation of Zhejiang Province (LZ22F020005), the Natural Science Foundation of Hebei Province (D2022512001), and the National Natural Science Foundation of China (42164002, 62076185).