Distributed K-means Clustering Algorithm (一种分布式的K-means聚类算法)
Cited by: 2
Abstract: A distributed clustering algorithm suited to large data sets is proposed. It modifies the conventional K-means algorithm so that it can run in a distributed environment. Based on a complexity analysis, the algorithm is compared with the conventional centralized (serial) K-means algorithm and with other distributed algorithms. Experimental results show that it retains all the necessary characteristics of centralized K-means while increasing data-processing speed.
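Authors: 梁建武 (Liang Jianwu), 田野 (Tian Ye)
Source: 《现代电子技术》 (Modern Electronics Technique), 2010, No. 10, pp. 11-14 (4 pages)
Funding: National Natural Science Foundation of China (60773013); Natural Science Foundation of Hunan Province (07JJ5078)
Keywords: K-means clustering algorithm; distributed environment; large data sets; complexity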
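
This record contains only the abstract, so the specific modifications made in the paper are not shown here. As a rough illustration of the general idea of running K-means over data split across several sites, the sketch below uses a common partial-sums scheme: each site assigns its own points to the nearest centroid and reports only per-cluster sums and counts, and a coordinator merges them. This is an assumption for illustration, not the authors' algorithm; all names (`local_step`, `distributed_kmeans`, `partitions`) are hypothetical.

```python
# Generic distributed K-means sketch (NOT the algorithm from the paper).
# Each "site" holds one partition of the data and ships only per-cluster
# sums and counts to a coordinator, which recomputes the centroids.
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))


def local_step(partition: Sequence[Point], centroids: Sequence[Point]):
    """Runs at one site: assign local points, return per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def distributed_kmeans(partitions: Sequence[Sequence[Point]],
                       centroids: Sequence[Point],
                       max_iter: int = 100,
                       tol: float = 1e-9) -> List[Point]:
    """Coordinator loop: merge site statistics until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    dim = len(centroids[0])
    for _ in range(max_iter):
        total_sums = [[0.0] * dim for _ in centroids]
        total_counts = [0] * len(centroids)
        for part in partitions:  # in a real deployment these calls run in parallel
            sums, counts = local_step(part, centroids)
            for k in range(len(centroids)):
                total_counts[k] += counts[k]
                for d in range(dim):
                    total_sums[k][d] += sums[k][d]
        new_centroids = []
        for k in range(len(centroids)):
            if total_counts[k] == 0:
                new_centroids.append(centroids[k])  # empty cluster: keep old centroid
            else:
                new_centroids.append(tuple(s / total_counts[k] for s in total_sums[k]))
        # Largest squared movement of any centroid in this round.
        shift = max(sum((a - b) ** 2 for a, b in zip(old, new))
                    for old, new in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids


# Example: six 2-D points spread over three hypothetical sites.
partitions = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
    [(1.0, 0.0), (11.0, 10.0)],
]
print(distributed_kmeans(partitions, [(0.0, 0.0), (10.0, 10.0)]))
```

Because the merged sums and counts are the same as those a single machine would compute over the full data set, this scheme yields the same centroids as centralized K-means from the same starting point (up to floating-point rounding), while each site transmits only a number of values proportional to the cluster count times the dimension per iteration.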