
Research on Clustering Algorithm Based on Spark Framework (cited by 9)
Abstract: Big data mining is a current research hotspot and has great commercial value. Spark, a newer framework that can be deployed on the Hadoop platform, provides machine learning algorithms that can almost completely replace the traditional Mahout MapReduce programming model, and because of its in-memory computing model it executes them quickly. This paper studies KMeans, a clustering algorithm among Spark's machine learning algorithms. It first analyzes the idea behind the algorithm, then examines how it is applied through experiments, and finally uses the experimental results to analyze its application scenarios and shortcomings.
Author: CHEN Hong-jun (陈虹君), Chengdu College of University of Electronic Science and Technology of China, Chengdu 611731, China
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2015, No. 2, pp. 56-57, 60 (3 pages)
Keywords: big data; Hadoop; Spark; machine learning; clustering; KMeans
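This record does not reproduce the paper's code or experiments. As a rough illustration of the KMeans idea the abstract refers to (assign each point to its nearest center, then move each center to the mean of its members), here is a minimal sketch in plain Python. It is not the paper's implementation and does not use Spark; the function name `kmeans`, its parameters, and the sample points are all assumptions made for illustration. In Spark itself the equivalent step would typically go through the MLlib clustering API (for example `KMeans.train` in `pyspark.mllib.clustering`), which distributes the same two steps across a cluster.

```python
import math
import random


def kmeans(points, k, max_iter=20, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples.

    Hypothetical helper for illustration; returns (centers, assignments).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: k distinct input points
    assignments = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        converged = new_centers == centers
        centers, assignments = new_centers, new_assignments
        if converged:
            break  # centers stopped moving
    return centers, assignments


# Made-up sample data: two visually separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, k=2)
```

The sketch keeps all data in local memory and recomputes every distance on each pass, which is the part a framework such as Spark parallelizes; the choice of initial centers (random input points here) is one of several common options.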

