An Optimized Algorithm for Mining Frequent Rule-Free Sets
Abstract: A basic task in data mining is to extract the frequent itemsets from a large database of transactions. This paper proposes extracting a condensed representation of the frequent itemsets, called the collection of frequent rule-free sets, instead of the whole collection of frequent itemsets. We show that this condensed representation can be used to regenerate all frequent itemsets and their exact supports without any access to the original data, and that it can be mined efficiently. We give an algorithm named HOPE-Ⅲ for extracting the frequent rule-free sets and compare it with A-Close, an algorithm that extracts frequent closed sets, another condensed representation of frequent itemsets previously investigated in the literature. In the reported experiments, HOPE-Ⅲ is much more efficient than A-Close in all cases.
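Authors: ZHAO Dong (赵栋), LU Yansheng (卢炎生)
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, 2005, No. 9, pp. 62-63 (2 pages)
Funding: National Key Science and Technology Program of the Tenth Five-Year Plan (2001BA102A040203)
Keywords: data mining; condensed representation; frequent itemset; rule-free set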
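The abstract states that exact supports can be regenerated from the condensed representation alone, but this record does not define rule-free sets or describe HOPE-Ⅲ. The Python sketch below is therefore only an illustration under a stated assumption: that a rule-free set behaves like a free set in the sense of reference 4, i.e. an itemset whose support is strictly smaller than that of each of its immediate subsets. The function names, the toy transactions and the plain levelwise search are all hypothetical and are not the authors' algorithm.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_frequent_free_sets(transactions, min_support):
    """Naive levelwise search (NOT HOPE-III; an illustrative assumption).

    Assumes 1 <= min_support <= len(transactions).
    Returns (free, border):
      free   maps each frequent free set to its support,
      border holds the infrequent candidates met during the search.
    """
    free = {frozenset(): len(transactions)}
    border = set()
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    while level:
        kept = []
        for cand in level:
            s = support(cand, transactions)
            if s < min_support:
                border.add(cand)
            # Free: strictly less frequent than every immediate subset.
            elif all(s < free[cand - {a}] for a in cand):
                free[cand] = s
                kept.append(cand)
        # Next candidates: all immediate subsets must be frequent free sets.
        k = len(level[0]) + 1
        pool = sorted(set().union(*kept)) if kept else []
        level = [frozenset(c) for c in combinations(pool, k)
                 if all(frozenset(c) - {a} in free for a in c)]
    return free, border


def regenerated_support(itemset, free, border):
    """Support of `itemset` derived without reading the transactions.

    Valid for itemsets over items that occur in the mined data.
    Returns None when the itemset is infrequent.
    """
    itemset = frozenset(itemset)
    if any(b <= itemset for b in border):
        return None
    # A frequent itemset has the support of its least frequent free subset.
    return min(s for f, s in free.items() if f <= itemset)


# Toy data (hypothetical): five transactions over items a, b, c, d.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"),
                frozenset("a"), frozenset("cd")]
free, border = mine_frequent_free_sets(transactions, min_support=2)

print(regenerated_support("ab", free, border))   # 3 ({a, b} is not free)
print(regenerated_support("abc", free, border))  # 2
print(regenerated_support("cd", free, border))   # None (infrequent)
```

In this toy run {a, b} is frequent but is not stored, because its support equals that of {b}; its support is still recovered as the minimum over its stored subsets. The infrequent candidates are kept alongside the free sets so that infrequent itemsets can be recognized without rereading the data.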
References (5)

  • 1. H Mannila, H Toivonen. Levelwise Search and Borders of Theories in Knowledge Discovery[J]. Data Mining and Knowledge Discovery, 1997, 1(3): 241-258.
  • 2. A Bykowski, C Rigotti. A Condensed Representation to Find Frequent Patterns[A]. Proc of the 20th ACM SIGACT-SIGMOD-SIGART Symp on Principles of Database Systems (PODS 2001)[C]. 2001. 267-273.
  • 3. N Pasquier, Y Bastide, R Taouil, et al. Efficient Mining of Association Rules Using Closed Itemset Lattices[J]. Information Systems, 1999, 24(1): 25-46.
  • 4. J Boulicaut, A Bykowski, C Rigotti. Approximation of Frequency Queries by Means of Free-Sets[A]. Proc of the 4th European Conf on Principles and Practice of Knowledge Discovery in Databases (PKDD'00)[C]. 2000. 75-85.
  • 5. J Pei, J Han, R Mao. Closet: An Efficient Algorithm for Mining Frequent Closed Itemsets[A]. Proc of the 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery[C]. 2000. 21-30.
