Multi-Classifier Ensemble Algorithm Based on Knowledge-Line Memory

Cited by: 7


Abstract: A multi-classifier system, a branch of hybrid intelligent systems, integrates a diverse set of classifiers so that the ensemble achieves better classification performance than its members. Because computing resources and the quality of individual classifiers are limited, fusing their results is an important problem in this field: given the same well-trained member classifiers, a better fusion strategy can effectively raise the classification accuracy of the whole system. Traditional methods have tried many fusion strategies, such as plain voting, weighted voting, and fusion functions, and classification accuracy has risen as these models developed. These models, however, focus on accuracy and pay little attention to interpretability, a problem that becomes unavoidable once model safety is a concern.

This paper models the knowledge-line memory theory from psychology, which describes how humans make decisions with memory, and proposes a heuristic multi-classifier ensemble algorithm with better interpretability, called the knowledge-line ensemble algorithm. The algorithm imitates human learning and inference to organize the fusion of multi-classifier results. In training, the model creates memories called knowledge-lines that store how different problems were solved, and it forgets memories, as humans do, to avoid being trapped by particular bad cases. Knowledge-lines and training samples are in one-to-one correspondence: a knowledge-line is a subset of the given well-trained classifiers that classifies the corresponding sample correctly. Different samples produce different knowledge-lines, so after training the model stores a varied collection of them, and together they form a set of mappings from different feature spaces to the answer space.

In inference, the model heuristically activates a subset of the stored knowledge-lines, and the active knowledge-lines vote to produce the result. The algorithm is sample-driven: when inferring a new case, only the knowledge-lines created from familiar samples are activated, much as people recall solutions from memory when they run into trouble. This makes both the intermediate process and the final result easy to analyze.

Because the way knowledge-line memory theory builds knowledge-lines from computing units resembles adding elements to sets, the paper models the computation with matrices. The connections between knowledge-lines and computing units are represented by an adjacency matrix, the results of the different classifiers are stored in a classification matrix, and activation is computed as the inner product of the results of all knowledge-lines with the activation vector. The final classification result can therefore be expressed as a matrix multiplication. On this basis, the paper explains the goal and convergence of the algorithm.

The experiments use decision trees as the given classifiers. With the same set of decision trees, the knowledge-line ensemble algorithm achieves accuracy comparable to random forest, which uses plain voting as its coordinating strategy. In addition, the algorithm can quantify how difficult an inference case is, at different granularities, from the activation state of the knowledge-lines, and it can supply specific related training samples as evidence for each inference, which makes its results more convincing.
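The matrix formulation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the activation heuristic, the forgetting rule, or how a knowledge-line aggregates its classifiers, so the code below makes assumptions that are labeled in the comments. In particular, it activates the `top_k` knowledge-lines whose training samples are nearest in feature space, sums one-hot votes inside each line, and omits forgetting entirely. All function names and the toy threshold classifiers are hypothetical.

```python
# Sketch of a knowledge-line style ensemble in plain Python.
# Assumptions (not from the paper): nearest-sample activation,
# summed one-hot votes, no forgetting step.

def build_adjacency(classifiers, samples, labels):
    """One knowledge-line per training sample: row k marks the
    classifiers that classify sample k correctly."""
    return [[1 if clf(x) == y else 0 for clf in classifiers]
            for x, y in zip(samples, labels)]


def classification_matrix(classifiers, x, n_classes):
    """Row c is the one-hot prediction of classifier c on input x."""
    rows = []
    for clf in classifiers:
        row = [0] * n_classes
        row[clf(x)] = 1
        rows.append(row)
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def activation_vector(samples, x, top_k):
    """Assumed heuristic: activate the top_k knowledge-lines whose
    training samples have the smallest squared distance to x."""
    dists = [sum((u - v) ** 2 for u, v in zip(s, x)) for s in samples]
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:top_k]
    return [1 if i in nearest else 0 for i in range(len(samples))]


def predict(classifiers, samples, adjacency, x, n_classes, top_k=2):
    """Return (predicted class, indices of supporting training samples)."""
    # Votes of every knowledge-line: adjacency x classification matrix.
    line_votes = matmul(adjacency,
                        classification_matrix(classifiers, x, n_classes))
    act = activation_vector(samples, x, top_k)
    # Inner product of the activation vector with each class column.
    scores = [sum(act[k] * line_votes[k][j] for k in range(len(act)))
              for j in range(n_classes)]
    supporting = [k for k, a in enumerate(act) if a]
    return scores.index(max(scores)), supporting


# Toy setup: three hypothetical threshold classifiers on 1-D inputs.
classifiers = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[0] > 0.2),
    lambda x: 0,
]
train_x = [[0.1], [0.3], [0.6], [0.9]]
train_y = [0, 0, 1, 1]

adjacency = build_adjacency(classifiers, train_x, train_y)
label, evidence = predict(classifiers, train_x, adjacency, [0.8], n_classes=2)
print(label, evidence)
```

Returning the indices of the activated training samples alongside the label mirrors the abstract's claim that an inference can be backed by specific training cases. The spread of votes among the active lines could serve as a rough difficulty signal, though the paper's actual measure is not given here.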

Authors: YU Si-Hao (于思皓); GUO Jia-Feng (郭嘉丰); FAN Yi-Xing (范意兴); LAN Yan-Yan (兰艳艳); CHENG Xue-Qi (程学旗) (Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190; Institute of Network Technology, ICT (YANTAI), CAS, Yantai, Shandong 264005)

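Affiliations: Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Institute of Network Technology, ICT (YANTAI), CAS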

Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2021, Issue 3, pp. 462-475 (14 pages)

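Funding: Supported by the National Natural Science Foundation of China (61722211, 61872338, 61902381); the Beijing Academy of Artificial Intelligence (BAAI2019ZD0306); the Youth Innovation Promotion Association of the Chinese Academy of Sciences (20144310); the National Key R&D Program of China (2016QY02D0405); the Lenovo-CAS Joint Lab Youth Scientist Project; the K. C. Wong Education Foundation; the Chongqing Basic Science and Frontier Technology Research Program (Key Project) (cstc2017jcjyBX0059); and the Taishan Scholars Program special fund (ts201511082).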

Keywords: multi-classifier; knowledge-line memory theory; heuristics; sample-driven; interpretability

Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]

