Improving Initial Centroid Selection in the K-Means Clustering Algorithm (cited by: 15)

K Mean Cluster Algorithm with Refined Initial Center Point
Abstract Cluster analysis is widely used in fields such as information retrieval and data mining. The K-means clustering algorithm is comparatively simple and fast, but it has two drawbacks: the number of clusters must be set in advance, and the initial centroids are chosen at random, so the clustering result may not be optimal. To address the random choice of initial centroids, the authors propose an initial centroid selection algorithm based on density and shared nearest neighbor (SNN) similarity. Experiments show that the algorithm produces higher-quality and more stable clustering results. The improved algorithm still has shortcomings: the SNN similarity threshold must be set in advance and the computational cost is relatively high, so further improvement is needed.
Source Journal of Shenyang Normal University: Natural Science Edition, CAS, 2009, No. 4, pp. 448-450 (3 pages)
Funding National Natural Science Foundation of China (60970112)
Keywords clustering; K-means clustering algorithm; initial centroid; density; shared nearest neighbor (SNN) similarity
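
The abstract names the ingredients of the method (density, SNN similarity, and a similarity threshold) but does not spell out the procedure. The following is a minimal Python sketch of one plausible reading, not the authors' algorithm: SNN similarity is taken as the number of shared members of two points' k-nearest-neighbor lists, density as the number of points whose SNN similarity reaches the threshold, and centroids are picked greedily in decreasing density while skipping points too similar to one already chosen. The names `snn_similarity` and `pick_initial_centroids` and the parameters `k` and `sim_threshold` are assumptions made for illustration.

```python
import numpy as np


def snn_similarity(X, k):
    """Count shared k-nearest neighbors for every pair of points.

    Assumption: SNN similarity of i and j = |kNN(i) ∩ kNN(j)|.
    """
    n = X.shape[0]
    # Full pairwise Euclidean distances: O(n^2) time and memory.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]
    # member[i, j] = 1 if j is among the k nearest neighbors of i.
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1
    return member @ member.T


def pick_initial_centroids(X, n_clusters, k=5, sim_threshold=2):
    """Greedy, density-ordered choice of initial centroids (illustrative).

    May return fewer than n_clusters rows if too few sufficiently
    dissimilar points exist for the given threshold.
    """
    sim = snn_similarity(X, k)
    np.fill_diagonal(sim, 0)  # ignore self-similarity
    # Assumed density: number of points that are SNN-similar to this one.
    density = (sim >= sim_threshold).sum(axis=1)
    order = np.argsort(-density, kind="stable")  # densest first
    chosen = []
    for i in order:
        # Accept a point only if it is dissimilar to every chosen centroid.
        if all(sim[i, j] < sim_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == n_clusters:
            break
    return X[chosen]
```

The returned array could then be handed to any K-means implementation that accepts explicit starting centers. The pairwise distance matrix makes the quadratic cost visible, which is consistent with the computational-cost drawback the abstract mentions.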