An Improved K-Modes Clustering Algorithm

Cited by: 7
Abstract: The traditional K-Modes clustering algorithm has several shortcomings. Its dissimilarity measure weakens intra-cluster similarity and ignores differences between attributes; a mode that holds a single value per attribute overlooks the possibility that an attribute may be characterized by a combination of several values; and the clustering result depends heavily on the initial center points. To address these problems, the MAV-K-Modes algorithm is proposed, based on a dissimilarity measure that uses multi-attribute-value Modes, together with an initial center selection method based on pre-clustering. Experiments on UCI datasets show that, compared with the traditional K-Modes algorithm, MAV-K-Modes achieves clearly higher accuracy, precision, and recall, and that it is well suited to parallelization.
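For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of the conventional K-Modes simple-matching dissimilarity with a single-value mode, not the MAV-K-Modes measure, whose formula is not given in this record. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter

def cluster_mode(records):
    """Conventional mode: the single most frequent value of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))

def simple_matching(record, mode):
    """Simple-matching dissimilarity: number of attributes that differ from the mode."""
    return sum(1 for value, mode_value in zip(record, mode) if value != mode_value)

# Toy cluster of categorical records (hypothetical data).
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
mode = cluster_mode(cluster)

# "blue" occurs in the cluster, "green" never does, yet both records
# receive the same dissimilarity because the mode keeps one value per attribute.
d_seen = simple_matching(("blue", "large", "round"), mode)
d_unseen = simple_matching(("green", "large", "round"), mode)
print(mode, d_seen, d_unseen)
```

In this sketch a value that appears in a third of the cluster is penalized exactly like a value that never appears, which is one way to read the abstract's remark that single-value Modes ignore combinations of attribute values.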
Authors: JIA Bin (贾彬); LIANG Yi (梁毅); SU Hang (苏航) (Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source: Software Guide (《软件导刊》), 2019, No. 6, pp. 60-64, 69 (6 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (61202074)
Keywords: K-Modes; clustering algorithm; dissimilarity measure; initial center points; multi-attribute value Modes
