
Implementation of a Precise Association Rule Mining Algorithm Based on Spark (cited by: 4)
Abstract: To mine association rules accurately in big data environments, this paper improves the association rule mining algorithm Apriori on top of the distributed computing framework Spark. The improved algorithm addresses the single-machine memory limits and performance shortcomings that Apriori runs into on large-scale data, while preserving the accuracy of the results. Its effectiveness is evaluated on an open-source dataset and a massive trajectory dataset. The experiments show that, compared with traditional Apriori, the improved algorithm produces results of the same accuracy, and that the volume of data that can be mined grows with the number of processing nodes, so that association rule mining is no longer limited by data scale.
Source: Information Technology (《信息技术》), 2018, No. 2, pp. 153-158 (6 pages)
Keywords: association rule mining; distributed computing; big data; Apriori; Spark
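
The abstract does not describe the improved algorithm itself, so the following is only a minimal sketch of the general idea it relies on: Apriori's support counting can be done independently on each data partition and the partial counts summed, which is the pattern that Spark-style map and reduce steps distribute across nodes. It is written in plain Python rather than against the Spark API, and every name in it (`count_partition`, `next_candidates`, `apriori_partitioned`, the absolute `min_support` threshold) is a hypothetical choice for illustration, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def count_partition(partition, candidates):
    """Count, within one partition, how many transactions contain each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def next_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, pruning by the Apriori property."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            # Keep a k-itemset only if every (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def apriori_partitioned(partitions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support.

    `partitions` is a list of lists of transactions (sets of items); it stands in
    for the data partitions that a distributed framework would hold on each node.
    """
    items = {item for part in partitions for t in part for item in t}
    candidates = {frozenset([item]) for item in items}
    result = {}
    k = 1
    while candidates:
        total = Counter()
        for part in partitions:
            # "Map": count per partition. "Reduce": sum the partial counts.
            total.update(count_partition(part, candidates))
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = next_candidates(set(frequent), k)
    return result


if __name__ == "__main__":
    partitions = [
        [{"a", "b", "c"}, {"a", "b"}],
        [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
    ]
    for itemset, count in apriori_partitioned(partitions, min_support=3).items():
        print(sorted(itemset), count)
```

Because the partial counts are summed exactly, splitting the transactions across more partitions does not change which itemsets are reported as frequent, which is consistent with the abstract's claim that the distributed version keeps the same accuracy as the single-machine one.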
