Abstract
Traditional K-means clustering treats all variables of an object uniformly and ignores the different roles that individual variables play in the clustering process, so the distance computed this way does not accurately represent the similarity between two objects. This paper proposes a clustering algorithm based on a distance metric: the algorithm replaces the Euclidean distance used in K-means with a new distance metric. With the new metric, the weight of a data point is no longer restricted to 1 or 0 but is determined by a coefficient, which turns the hard partition into a soft partition. Experiments show that the improved algorithm converges considerably faster than traditional K-means, which improves the algorithm's execution efficiency.
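The abstract does not state the new distance metric or how the coefficients are computed, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a per-variable weighted squared Euclidean distance and swaps in the standard fuzzy c-means membership formula to get soft (non 0/1) assignments. The names `weighted_sq_dist`, `soft_memberships`, `update_centers`, the weight vector `w`, and the fuzzifier `m` are all hypothetical.

```python
def weighted_sq_dist(x, c, w):
    """Squared Euclidean distance with an assumed per-variable weight w[j]."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))


def soft_memberships(x, centers, w, m=2.0):
    """Fuzzy c-means style membership degrees of point x; they sum to 1."""
    d = [weighted_sq_dist(x, c, w) for c in centers]
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:  # x coincides with one or more centers: share full membership
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    p = 1.0 / (m - 1.0)
    inv = [(1.0 / di) ** p for di in d]
    total = sum(inv)
    return [v / total for v in inv]


def update_centers(points, U, m=2.0):
    """Recompute each center as the membership-weighted mean of all points."""
    k, dim = len(U[0]), len(points[0])
    centers = []
    for i in range(k):
        coef = [u[i] ** m for u in U]  # soft weights replace 0/1 assignment
        s = sum(coef)
        centers.append([sum(cf * pt[j] for cf, pt in zip(coef, points)) / s
                        for j in range(dim)])
    return centers


# Toy data: two groups separated along the first variable.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
centers = [[1.0, 0.5], [9.0, 0.5]]
w = [1.0, 0.5]  # assumed weights: the second variable counts half as much
for _ in range(10):
    U = [soft_memberships(p, centers, w) for p in points]
    centers = update_centers(points, U)
```

In hard K-means each point contributes to exactly one center with weight 1. Here every point contributes to every center in proportion to its membership degree, which is one common way to realize the "soft partition" the abstract mentions.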
Source
Electronic Design Engineering (《电子设计工程》)
2012, Issue 22, pp. 86-88 (3 pages)
Keywords
data mining
algorithm
Euclidean distance
K-means clustering analysis