
Improved KNN Classification Algorithm Based on Dimension Index Table (cited by: 3)
Abstract: This paper describes the basic principle of the traditional KNN classifier and its shortcomings. To address the sharp growth in similarity computations that KNN incurs as the number of samples and the dimensionality increase, it proposes an improved KNN classification algorithm based on a dimension index table. The algorithm builds an index table over feature-term dimensions to speed up the search for the K nearest neighbors. Using news documents from the text categorization corpus of the Sogou Natural Language Lab as the experimental data, and the macro-averaged F-measure as the evaluation criterion, comparative experiments were carried out between the improved KNN method and the traditional KNN method. The results show that the proposed method greatly reduces the number of similarity computations needed to find the K nearest neighbors.
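Source: Information Studies: Theory & Application (《情报理论与实践》), CSSCI, PKU Core, 2014, No. 5, pp. 102-106 (5 pages)
Funding: One of the outcomes of the National Natural Science Foundation of China project "Theoretical and Experimental Research on Multidisciplinary Collaborative Modeling for Text Classification" (project no. 71373291)
Keywords: text categorization; dimension index table; vector space model; classification algorithm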
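The abstract does not spell out how the dimension index table is laid out, so the following is only a minimal sketch under one assumed reading: the table maps each feature dimension to the training documents that have a nonzero weight in that dimension, and cosine similarity is computed only for documents sharing at least one dimension with the query. The function names, the sparse-dict vector representation, and the similarity-weighted vote are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def build_dimension_index(train_vectors):
    """Assumed table layout: feature dimension -> ids of training docs
    whose weight in that dimension is nonzero."""
    index = defaultdict(set)
    for doc_id, vec in enumerate(train_vectors):
        for dim, weight in vec.items():
            if weight != 0:
                index[dim].add(doc_id)
    return index


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(d, 0.0) for d, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, train_vectors, labels, index, k=3):
    """Classify `query`, scoring only docs that share a nonzero dimension.

    Returns (predicted label, number of similarity computations)."""
    candidates = set()
    for dim, weight in query.items():
        if weight != 0:
            candidates |= index.get(dim, set())
    # Docs outside `candidates` have a zero dot product with the query,
    # so their cosine similarity is 0 and they are skipped.
    scored = sorted(
        ((cosine(query, train_vectors[i]), i) for i in candidates),
        reverse=True,
    )[:k]
    votes = Counter()
    for sim, i in scored:
        votes[labels[i]] += sim  # similarity-weighted vote (an assumption)
    label = votes.most_common(1)[0][0] if votes else None
    return label, len(candidates)


# Toy data, invented for illustration only.
train = [
    {"goal": 1.0, "match": 1.0},
    {"goal": 1.0, "team": 1.0},
    {"stock": 1.0, "market": 1.0},
    {"stock": 1.0, "bank": 1.0},
]
labels = ["sports", "sports", "finance", "finance"]
index = build_dimension_index(train)
label, n_computed = knn_classify({"goal": 1.0, "team": 0.5}, train, labels, index, k=3)
print(label, n_computed)
```

On this toy data only two of the four training documents are scored, whereas a brute-force KNN would score all four. The saving depends on how sparse the vectors are; with dense vectors nearly every document becomes a candidate.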
