
Optimization of clustering by fast search and find of density peaks (cited by: 34)
Abstract: CFSFDP (clustering by fast search and find of density peaks) is a recent density-based clustering algorithm that can cluster non-spherical data sets and is both fast and simple to implement. However, CFSFDP requires the density threshold dc to be determined by manual trial and error, and it cannot cluster data accurately when a single class contains multiple density peaks. To address these shortcomings, this paper proposes NM-CFSFDP, an optimization of CFSFDP based on the nearest neighbor distance curve and cluster merging. The algorithm determines dc automatically from the changes in the nearest neighbor distance curve, clusters the data with CFSFDP using that dc, and uses the method for computing dc to guide the merging of classes. A parameter that measures the degree of cohesion is introduced so that a merge can be revoked, which solves the problem that a merge could not be undone once performed and allows data with multiple density peaks to be clustered correctly. Comparative experiments show that NM-CFSFDP clusters more accurately than CFSFDP.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2016, No. 11, pp. 3251-3254 (4 pages)
Funding: National Natural Science Foundation of China (61173130)
Keywords: clustering; density peaks; nearest neighbor distance curve; merging clusters
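
For orientation, the baseline CFSFDP procedure that the abstract builds on can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `cfsfdp` and the `n_clusters` parameter are hypothetical, the local density uses a simple cutoff count, and cluster centers are picked as the points with the largest product of density and distance (a common stand-in for reading the decision graph by hand). The value of dc is supplied manually here; the NM-CFSFDP steps (choosing dc from the nearest neighbor distance curve, merging classes, revoking merges) are not reproduced because the abstract does not give their details.

```python
import numpy as np


def cfsfdp(points, dc, n_clusters):
    """Baseline density-peaks clustering with a hand-picked dc (illustrative sketch)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

    # Local density rho: number of other points closer than dc (self excluded).
    rho = (dist < dc).sum(axis=1) - 1

    # Visit points from highest to lowest density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")

    # delta: distance to the nearest point that comes earlier in that order.
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The global density peak has no denser point; use its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Assumption: take the n_clusters points with the largest rho * delta as centers.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # always keep the global density peak as a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Each remaining point inherits the label of its nearest denser neighbor.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta
```

Calling `cfsfdp(data, dc, 2)` on a small two-dimensional data set returns one label per point together with the density and distance values. The result depends strongly on the choice of dc, which is the manual step the abstract proposes to automate.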