Abstract
Traditional feature selection considers only the relevance and redundancy of features and ignores the interaction between them. To address this, a two-stage feature selection algorithm based on interaction information (strong approximate Markov blanket and feature complementarity, SAMBFC) is proposed. In the first stage, irrelevant and redundant features are screened out using symmetric uncertainty and the strong approximate Markov blanket principle. In the second stage, the interaction gain between features is combined with the correlation-based feature selection algorithm to build a complementarity evaluation method, which is used to select redundant features that interact with other features. In comparative experiments against 8 typical algorithms on 9 standard data sets of different dimensionality, the features selected by SAMBFC gave clearly better classification performance and overall performance than those selected by the other algorithms.
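As a rough illustration of the two information-theoretic quantities named in the abstract, the sketch below computes symmetric uncertainty and interaction gain for discrete features. It uses the textbook definitions, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) and IG(X; Y; C) = I(X, Y; C) - I(X; C) - I(Y; C), which may differ in detail from the formulation in the paper. The function names are hypothetical, and the strong approximate Markov blanket test and the complementarity score of SAMBFC are not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def symmetric_uncertainty(x, y):
    """Textbook SU: 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    return 2.0 * mutual_information(x, y) / (hx + hy)


def interaction_gain(x, y, c):
    """Textbook interaction gain: I(X, Y; C) - I(X; C) - I(Y; C).

    A positive value means X and Y together tell more about C than
    the sum of what each tells alone (they are complementary).
    """
    joint = list(zip(x, y))  # treat the pair (X, Y) as one variable
    return (mutual_information(joint, c)
            - mutual_information(x, c)
            - mutual_information(y, c))


# Toy XOR data (an assumption for illustration): each feature alone is
# uninformative about the class, but the pair determines it.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
c = [0, 1, 1, 0]
su_xc = symmetric_uncertainty(x, c)
ig_xyc = interaction_gain(x, y, c)
```

On the XOR data each feature has zero symmetric uncertainty with the class, so a filter based only on relevance would discard both, while the positive interaction gain shows that they are informative together. This is the kind of case that motivates scoring interaction in addition to relevance and redundancy.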
Authors
LIU Qiang (刘强)
JIANG Ai-Lian (降爱莲)
College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, No. 1, pp. 125-132 (8 pages)
Funding
Shanxi Province Scientific Research Fund for Returned Overseas Scholars (2017-051).
Keywords
feature selection
two stages
strong approximate Markov blanket
symmetric uncertainty
correlation
redundancy
complementarity