TLWCC: A Two-Level Subspace Weighting Co-clustering Algorithm
Abstract: Co-clustering algorithms cluster the rows and columns of a data matrix simultaneously. In this paper, we introduce the idea of two-level subspace weighting into co-clustering and propose TLWCC, a two-level subspace weighting co-clustering algorithm. TLWCC adds a first level of weights on co-clusters and a second level of weights on rows and columns; the three types of weights (co-cluster, row, and column weights) are computed automatically during the iterative clustering process, according to the distances between co-clusters (or rows, columns) and their centers. The larger the distance, the stronger the implied noise, so a smaller weight is given, and vice versa. By giving small weights to noisy information, TLWCC effectively suppresses the interference of noise and improves the co-clustering result. We propose an iterative algorithm to optimize the model and carried out four experiments: the first investigated the ability of the three types of weights to identify noise, the second studied how the parameters influence the clustering result, the third compared the clustering performance of TLWCC with three other algorithms, and the fourth examined the computational efficiency of the proposed algorithm.
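The abstract describes the weighting idea (larger distance to a co-cluster center implies stronger noise, hence a smaller weight) but not the exact formulas. As a rough sketch of the co-cluster-level weighting only, assuming an entropy-style exponential form with a regularization parameter `gamma` (the function name and all details below are hypothetical, not the paper's actual model):

```python
import numpy as np

def cocluster_weights(X, row_labels, col_labels, gamma=1.0):
    """Sketch of TLWCC-style co-cluster weighting (assumed form).

    Each co-cluster (block defined by one row cluster and one column
    cluster) gets a weight that decays with the total squared distance
    of its entries to the block mean, so noisier blocks receive
    smaller weights. Weights are normalized to sum to 1.
    """
    n_row_clusters = row_labels.max() + 1
    n_col_clusters = col_labels.max() + 1
    dist = np.zeros((n_row_clusters, n_col_clusters))
    for r in range(n_row_clusters):
        for c in range(n_col_clusters):
            # Extract the block of X belonging to row cluster r and
            # column cluster c, then measure its scatter around its mean.
            block = X[np.ix_(row_labels == r, col_labels == c)]
            if block.size:
                dist[r, c] = ((block - block.mean()) ** 2).sum()
    # Larger distance -> smaller weight (exponential decay, an assumption).
    w = np.exp(-dist / gamma)
    return w / w.sum()
```

In TLWCC itself these weights would be recomputed inside each iteration, alongside the second-level row and column weights, so that noisy blocks, rows, and columns progressively lose influence on the cluster assignments.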
Source: Journal of Integration Technology, 2013, No. 1, pp. 16-22 (7 pages)
Keywords: co-clustering, weighting, clustering, data mining

