
Clustering Method Based on Structural Similarity and Compressive Transformation

Cited by: 5
Abstract: Existing clustering methods handle data sets with arbitrary shapes, arbitrary densities, and distinct structural features poorly. To address this shortcoming, a discrete topological manifold is first built in the data space. Two relative measures are defined on this structure: neighborhood density similarity and smoothness of neighborhood density change. Using reachability, the structural similarity of samples and the class structure are then defined, and the class structure relation is proved to be an equivalence relation. Next, structural similarity is treated as an attractive force, and a clustering method based on compressive transformation is designed. The method handles clusters of arbitrary shape and arbitrary density and offers good interpretability, among other advantages. Finally, comparative experiments on artificial and standard data sets show that the method clearly outperforms the other clustering algorithms tested in both clustering efficiency and effectiveness.
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2011, No. 5, pp. 637-644 (8 pages). Indexed in: EI, CSCD, Peking University Core Journals.
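Funding: Supported by the National Natural Science Foundation of China (No. 60903103, 10872085) and the Applied Basic Research Fund of the Science and Technology Department of Sichuan Province (No. 07JY029-125).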
Keywords: Cluster Analysis, Discrete Topological Manifold, Structural Similarity, Class Structure, Compressive Transformation
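The abstract states that classes are defined through reachability and that the resulting class relation is an equivalence relation. The sketch below illustrates only that general idea and is not the paper's algorithm: the density estimate (inverse mean distance to the `k` nearest neighbors), the similarity test (a density ratio compared with `min_ratio`), and all function and parameter names are assumptions made for illustration. The smoothness measure and the compressive transformation are not modeled, since the abstract gives no formulas for them.

```python
import math


def knn_density(points, k):
    """Assumed density estimate: inverse mean distance to the k nearest neighbors.

    Returns (densities, neighbor_index_lists).
    """
    densities, neighbors = [], []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        nearest = dists[:k]
        mean_d = sum(d for d, _ in nearest) / len(nearest)
        densities.append(1.0 / (mean_d + 1e-12))  # guard against zero distance
        neighbors.append([j for _, j in nearest])
    return densities, neighbors


def cluster_by_reachability(points, k=3, min_ratio=0.5):
    """Group points that are linked by a chain of density-similar neighbors.

    Two neighboring points are linked when the ratio of their densities
    (smaller over larger) is at least min_ratio (a hypothetical threshold).
    Union-find then merges linked points, so the final classes are the
    reflexive, symmetric, transitive closure of the link relation,
    which is an equivalence relation by construction.
    """
    densities, neighbors = knn_density(points, k)
    parent = list(range(len(points)))

    def find(i):
        # Find the class representative, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in neighbors[i]:
            lo, hi = sorted((densities[i], densities[j]))
            if lo / hi >= min_ratio:
                parent[find(i)] = find(j)  # merge the two classes

    return [find(i) for i in range(len(points))]


if __name__ == "__main__":
    # Two well-separated unit squares.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    print(cluster_by_reachability(pts, k=2))
```

Because the classes come from a closure over pairwise links, no cluster count or cluster shape has to be specified in advance, which matches the abstract's claim about arbitrary shapes, though only under the simplified assumptions stated above.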


