Abstract
Clustering by fast search and find of density peaks (DPC) is a density-based clustering algorithm that requires neither iteration nor many parameter settings. However, because it ignores the local structure of the data when computing local density, it fails to identify the centers of low-density clusters. To address this problem, a DPC algorithm based on K-reciprocal neighbors (KN) and kernel density estimation (KDE), called KKDPC, is proposed. First, the number of K-reciprocal neighbors and the local kernel density of each data point are obtained using the K-nearest-neighbor and KDE methods. These two quantities are then summed to give a new local density. Finally, the relative distance of each data point is computed from its local density, and a decision graph is constructed to select the cluster centers and assign the non-center points. Experiments on artificial and real datasets compare KKDPC with seven clustering algorithms: DPC, density-based spatial clustering of applications with noise (DBSCAN), K-means, fuzzy C-means clustering (FCM), DPC based on K-nearest neighbors (DPC-KNN), DPC with nearest neighbor optimization (DPC-NNO), and DPC based on fuzzy weighted shared neighbors (DPC-FWSN). Performance is evaluated with adjusted mutual information (AMI), the adjusted Rand index (ARI), and normalized mutual information (NMI). The experimental results show that KKDPC identifies cluster centers more accurately and effectively improves clustering accuracy.
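The pipeline summarized in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the Gaussian kernel evaluated over each point's K nearest neighbors, the `bandwidth` parameter, the unnormalized sum of the two terms, and the selection of centers by the largest product of density and relative distance are all assumptions made here for illustration. The function names are hypothetical.

```python
import math

def local_density(points, k, bandwidth):
    """Assumed density: (# of K-reciprocal neighbors) + (Gaussian kernel sum over the KNN)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # K nearest neighbors of each point, excluding the point itself.
    knn = [set(sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k])
           for i in range(n)]
    rho = []
    for i in range(n):
        # j is a K-reciprocal neighbor of i when each is in the other's KNN set.
        mutual = sum(1 for j in knn[i] if i in knn[j])
        # Gaussian kernel over the KNN distances (an assumption, not the paper's formula).
        kde = sum(math.exp(-0.5 * (dist[i][j] / bandwidth) ** 2) for j in knn[i])
        rho.append(mutual + kde)
    return rho, dist

def relative_distance(rho, dist):
    """DPC relative distance: distance to the nearest point of higher density."""
    n = len(rho)
    order = sorted(range(n), key=lambda i: -rho[i])  # descending density, ties by index
    delta, parent = [0.0] * n, [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # global density peak: use its largest distance
        else:
            j = min(order[:rank], key=lambda j: dist[i][j])
            delta[i], parent[i] = dist[i][j], j
    return delta, parent, order

def kkdpc_sketch(points, k, bandwidth, n_clusters):
    rho, dist = local_density(points, k, bandwidth)
    delta, parent, order = relative_distance(rho, dist)
    # Stand-in for reading the decision graph: take the largest rho * delta values.
    gamma = [r * d for r, d in zip(rho, delta)]
    centers = sorted(range(len(points)), key=lambda i: -gamma[i])[:n_clusters]
    labels = [-1] * len(points)
    for label, i in enumerate(centers):
        labels[i] = label
    # Visit points in descending density so every parent is labeled before its children.
    # The global peak always has the largest rho and delta, so it is always a center.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Usage: a dense 3x3 grid near the origin and a sparse 3x3 grid far away.
dense = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
sparse = [(10.0 + i, 10.0 + j) for i in range(3) for j in range(3)]
labels = kkdpc_sketch(dense + sparse, k=4, bandwidth=1.0, n_clusters=2)
```

In this toy input the K-reciprocal count does not depend on the absolute scale of a cluster, so the middle point of the sparse grid still receives a comparatively high density. That is the intuition behind adding a neighbor-structure term to the density, although the paper's exact weighting may differ.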
Authors
周玉
夏浩
刘虹瑜
白磊
ZHOU Yu*; XIA Hao; LIU Hongyu; BAI Lei (School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China)
Source
《北京航空航天大学学报》
Peking University Core Journal
2025, No. 6, pp. 1978-1990 (13 pages)
Journal of Beijing University of Aeronautics and Astronautics
Funding
National Natural Science Foundation of China (U1504622, 31671580)
Training Program for Young Backbone Teachers of Henan Province (2018GGJS079).
Keywords
clustering algorithm
density peak
K-nearest neighbor
K-reciprocal neighbor
kernel density estimation