Improvement of kNN Algorithm in Text Classification
Cited by: 4
Abstract: The kNN (k nearest neighbor) algorithm trains a classifier from labeled data; it is an instance-based learning algorithm for text classification. Building on a study of kNN, this paper combines the ideas of the k-nearest-neighbor method and the nearest feature line (NFL) method and proposes a new classifier, the k nearest feature line method (kNFL). Applied to text classification, kNFL draws on the strengths of both kNN and NFL, reduces accidental error, and improves the algorithm's adaptability and classification accuracy.
Authors: Hu Rong (胡荣), Luo Qingyun (罗庆云)
Source: Journal of University of South China: Science and Technology (《南华大学学报(自然科学版)》), 2005, No. 3, pp. 78-80 (3 pages)
Keywords: text categorization; kNN algorithm; NFL algorithm; k nearest feature line
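
The abstract names the kNFL idea but does not give its details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a feature line is the straight line through two training vectors of the same class, that the distance from a query to a feature line is the distance to its orthogonal projection on that line, and that the label is chosen by majority vote among the k nearest feature lines. The function names `feature_line_distance` and `knfl_classify` and the toy 2-D vectors are hypothetical; in text classification the vectors would typically be document term weights.

```python
from collections import Counter
from itertools import combinations
import math


def feature_line_distance(x, a, b):
    """Distance from point x to the infinite line through points a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    denom = sum(d * d for d in ab)
    if denom == 0.0:
        # Degenerate line (a == b): fall back to the point distance.
        return math.sqrt(sum(d * d for d in ax))
    # Position of the orthogonal projection of x along the line a + t * (b - a).
    t = sum(p * q for p, q in zip(ax, ab)) / denom
    proj = [ai + t * d for ai, d in zip(a, ab)]
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, proj)))


def knfl_classify(x, samples, k=3):
    """Assumed kNFL rule: majority label among the k nearest feature lines.

    samples is a list of (vector, label) pairs.
    """
    by_class = {}
    for vec, label in samples:
        by_class.setdefault(label, []).append(vec)

    scored = []
    for label, vecs in by_class.items():
        # Every pair of same-class samples defines one feature line.
        for a, b in combinations(vecs, 2):
            scored.append((feature_line_distance(x, a, b), label))
    if not scored:
        raise ValueError("need at least one class with two or more samples")

    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two classes spread along different directions.
samples = [
    ((1.0, 0.0), "sports"), ((2.0, 0.0), "sports"), ((3.0, 0.1), "sports"),
    ((0.0, 1.0), "finance"), ((0.0, 2.0), "finance"), ((0.1, 3.0), "finance"),
]
label = knfl_classify((5.0, 0.1), samples, k=3)
```

In this toy case the query lies far from every individual training point but close to the lines extrapolated from the "sports" samples, which is the situation where a feature-line distance is expected to behave differently from a plain point-to-point kNN distance.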