Research and Advances of Frequent Itemsets Mining

Cited by: 10
Abstract: Mining frequent itemsets is a key problem in many data mining tasks and the core of association rule mining algorithms, so improving the efficiency of frequent itemset generation has been one of the active research topics in data mining in recent years, and researchers have improved the algorithms from a range of perspectives. This paper examines the basic strategies of frequent itemset mining in terms of the type of search space explored during itemset generation, search methods and pruning strategies, database representation, and data compression techniques. It introduces and reviews representative algorithms, especially recent ones, for mining all frequent itemsets, frequent closed itemsets, and maximal frequent itemsets, analyzes their performance characteristics, and indicates which types of datasets each algorithm suits. Finally, it offers a preliminary discussion of future directions for frequent itemset mining algorithms.
Source: Computer Simulation (《计算机仿真》), CSCD, 2006, No. 4, pp. 68-73 (6 pages)
Funding: National Basic Research Development Fund (973 Program, G1999032701); Natural Science Foundation of Jiangsu Province (BK2002091)
Keywords: data mining; frequent itemsets; search method; pruning strategy