Abstract
Building on an analysis of existing incomplete information entropy measures, this paper proposes an incomplete information entropy based on similarity relations (SIIE) and proves several of its properties. A feature selection algorithm for incomplete data is then presented. The algorithm uses SIIE as the feature selection criterion, computing it directly on the incomplete data, and employs the sequential forward floating search method to address correlation among features. Experiments on UCI data sets indicate that the proposed algorithm achieves higher accuracy and faster feature selection.
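The sequential forward floating search mentioned in the abstract can be illustrated with a minimal sketch. This is the generic textbook form of the search, not the paper's algorithm: the names `sffs`, `toy_score`, and `WEIGHTS` are hypothetical, and `toy_score` is a made-up criterion standing in for an SIIE-based measure, whose definition the abstract does not give.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Score = Callable[[List[str]], float]  # higher is better (assumed convention)


def sffs(features: Sequence[str], score: Score, k: int) -> List[str]:
    """Generic sequential forward floating search for a subset of size k."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and len(features)")
    selected: List[str] = []
    # Best (score, subset) seen so far for each subset size.
    best: Dict[int, Tuple[float, List[str]]] = {}
    while len(selected) < k:
        # Forward step: add the feature that gives the highest score.
        remaining = [f for f in features if f not in selected]
        add = max(remaining, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected)
        # Floating step: drop features while doing so strictly beats
        # the best subset already recorded at the smaller size.
        while len(selected) > 2:
            drop = max(
                selected,
                key=lambda f: score([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != drop]
            s = score(reduced)
            if s > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s, reduced)
            else:
                break
    return best[k][1]


# Hypothetical criterion: per-feature weights with a penalty when the
# two correlated features "a" and "b" are selected together.
WEIGHTS = {"a": 3.0, "b": 2.9, "c": 1.0, "d": 0.5}


def toy_score(subset: List[str]) -> float:
    total = sum(WEIGHTS[f] for f in subset)
    if "a" in subset and "b" in subset:
        total -= 2.5  # redundancy penalty for the correlated pair
    return total


chosen = sffs(list(WEIGHTS), toy_score, 2)
```

Because the criterion is passed in as a function, the search itself does not depend on how the entropy is computed. The strict-improvement test in the floating step is what guarantees termination.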
Source
《模式识别与人工智能》
EI
CSCD
PKU Core Journal
2014, No. 12, pp. 1131-1137 (7 pages)
Pattern Recognition and Artificial Intelligence
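Funding
Supported by:
National Natural Science Foundation of China (No. 61005010)
Anhui Provincial Natural Science Foundation (No. 1308085MF84, 1408085MF135)
Provincial Natural Science Foundation of Anhui Higher Education Institutions (No. KJ2012B149, 2013SQRL074ZD)
Key Discipline Construction Project of Hefei University (No. 2014XK08)
Training Program for Academic Leaders of Hefei University (No. 2014dtr08)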
Keywords
Feature Selection, Incomplete Data, Incomplete Information Entropy, Incomplete Decision Table, Similarity Relation