
A k-means clustering analysis method incorporating the coefficient of variation (cited by 5)

K-means clustering algorithm based on coefficient of variation
Abstract: The performance of the k-means clustering algorithm depends on the choice of distance metric, and Euclidean distance is the most commonly used. Euclidean distance treats all attributes as equally important in clustering, so it does not accurately reflect the dissimilarity between samples. To address this shortcoming, a k-means clustering analysis method incorporating the coefficient of variation (CV-k-means) is proposed, which uses a weight vector derived from the coefficient of variation to reduce the influence of irrelevant attributes. Experimental results show that the proposed method produces better clustering results than the standard k-means algorithm.
Source: Computer Engineering and Applications (CSCD), 2012, Issue 35, pp. 114-117 (4 pages)
Keywords: k-means algorithm; dissimilarity measure; coefficient of variation; weighting
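The abstract describes weighting each attribute by its coefficient of variation (standard deviation divided by mean) inside the k-means distance computation. A minimal sketch of that idea is shown below; the exact weighting formula from the paper is not reproduced in this abstract, so the choice here (CVs normalized to sum to 1, used in a weighted squared Euclidean distance) is an illustrative assumption, as are the function names `cv_weights` and `cv_kmeans`.

```python
import numpy as np

def cv_weights(X):
    # Coefficient of variation per feature: std / |mean|.
    # Assumption: weights are the CVs normalized to sum to 1;
    # the paper's exact formula is not given in the abstract.
    mean = np.abs(X.mean(axis=0))
    std = X.std(axis=0)
    cv = std / np.where(mean == 0, 1.0, mean)  # guard zero means
    return cv / cv.sum()

def cv_kmeans(X, k, n_iter=100, seed=0):
    # Lloyd's k-means, but with a CV-weighted squared Euclidean
    # distance so high-variation features count more.
    rng = np.random.default_rng(seed)
    w = cv_weights(X)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Weighted squared distance of every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2 * w).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

On well-separated data the weighted assignment behaves like standard k-means, but when one attribute is nearly constant (low CV) its contribution to the distance shrinks, which is the effect the abstract attributes to the weight vector.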
