Cluster Ensemble Algorithm for Mixed-Attribute Data (cited by: 9)

Cluster ensemble method for databases with mixed numeric and categorical values
Abstract: Datasets with mixed numeric and categorical attributes are the most common kind of dataset in the real world, particularly in commercial and financial databases, yet very few clustering algorithms are suited to them. Following the cluster ensemble methodology and targeting the characteristics of mixed-attribute data, this paper proposes an ensemble-based mixed-attribute clustering algorithm (CEMC), establishes its framework, states its objective function and main steps, and analyzes its complexity. The algorithm scales to very large mixed-attribute datasets. It was validated on real datasets, where its clustering accuracy was better than that of existing clustering algorithms for mixed numeric and categorical data, and it was applied to customer relationship management data analysis in practice with good results.
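Source: Journal of Tsinghua University (Science and Technology), 2006, No. 10, pp. 1673-1676 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Natural Science Foundation of China (70202008).
Keywords: cluster ensemble; mixed numeric and categorical attributes; customer relationship management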