KNN Classification Algorithm for Small Sample Sets Based on K-Means Clustering
Cited by: 10
Abstract: When KNN and its improved variants are used for classification, accuracy suffers if the samples are too concentrated, too few, or differ widely in density across classes. This paper proposes a KNN classification algorithm for small sample sets based on clustering. Clustering and editing produce a new sample set in which all classes have similar sample densities, and this new set is then used to assign class labels to data objects whose labels are unknown. Tests on standard data sets show that the algorithm improves KNN classification accuracy, with satisfactory results.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2011, No. 5, pp. 112-113, 125 (3 pages)
Funding: Planning Project of the Natural Science Research Fund of Gansu Province (1010RJZA069)
Keywords: K-means clustering; K-nearest neighbor; small sample set
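
The abstract describes the idea only in outline: cluster the training data, edit it so that every class has a similar sample density, then run KNN on the resulting set. The following is a minimal Python sketch of that general idea, not the paper's actual procedure, which this record does not spell out. As an assumption, it replaces each class with at most `per_class` k-means centroids and omits the editing step entirely; the function names, the `per_class` parameter, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def build_prototypes(samples, labels, per_class):
    """Assumed stand-in for 'clustering and editing': each class is reduced to
    at most `per_class` centroids, so all classes end up with similar sizes."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(tuple(x))
    prototypes = []
    for y, pts in by_class.items():
        k = min(per_class, len(pts))
        prototypes += [(c, y) for c in kmeans(pts, k)]
    return prototypes


def knn_predict(labeled_points, query, k=3):
    """Majority vote among the k labeled points nearest to `query`."""
    nearest = sorted(labeled_points, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Toy data (hypothetical): class "a" is dense, class "b" has only two samples.
samples = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (0.2, 0), (0, 0.2),
           (5, 5), (5.2, 5.1)]
labels = ["a"] * 6 + ["b"] * 2
query = (4.5, 4.5)

raw = list(zip(samples, labels))
prototypes = build_prototypes(samples, labels, per_class=2)

print(knn_predict(raw, query, k=5))         # vote over the raw, unbalanced samples
print(knn_predict(prototypes, query, k=3))  # vote over the balanced prototypes
```

On this toy data the raw 5-nearest-neighbor vote is dominated by the dense class even though the query lies next to the sparse one, while the vote over the balanced prototypes picks the nearby class. This illustrates the density problem the abstract names, not the accuracy gains the paper reports.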