Abstract
This paper proposes an improved feature selection method based on mutual information (MI). The method is combined with an improved TF-IDF weighting formula to select text features, which makes more efficient use of the information carried by feature terms. Experiments show that the method improves text classification accuracy.
Source
《情报科学》
CSSCI
PKU Core Journal (北大核心)
2007, No. 10, pp. 1534-1537 (4 pages)
Information Science
Keywords
information gain
mutual information
information ratio
feature selection
text classification