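Abstract
Association rule mining is one of the major research topics in data mining and knowledge discovery, and its core task is mining the frequent itemsets in a database. Apriori and its improved variants are effective algorithms for frequent itemset mining. Apriori-like algorithms all use a hash tree to store the candidate itemsets so that their support can be counted quickly. Based on a careful analysis of the efficiency bottlenecks of these algorithms, this paper proposes another effective improved algorithm. The proposed algorithm replaces the complex hash tree used by existing algorithms with a one-dimensional array in order to relieve those bottlenecks. Several experimental evaluations show that the proposed algorithm mines efficiently, running 2 to 5 times faster than Apriori and its improved variants.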
Mining of association rules is considered to be one of the most important data mining tasks. Frequent itemset mining plays an essential role in mining association rules. Many previous studies adopt an Apriori-like approach, in which a hash tree is used to store candidate itemsets. Based on an analysis of the performance bottleneck of Apriori-like algorithms, an efficient algorithm for faster mining of frequent itemsets is proposed in this paper. It adopts a one-dimensional array instead of the complex hash-tree structure to expedite the mining process. Several experiments assess the relative performance of the algorithm in comparison with Apriori and its extended algorithm. The experimental evaluation shows that the algorithm is faster than both algorithms by a factor of two to five.
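The abstract does not spell out how the one-dimensional array is laid out, so the following is only an illustrative sketch and not the paper's algorithm. It shows one well-known way a flat array can stand in for a hash tree when counting candidate 2-itemsets: each pair of frequent items is mapped to a slot by triangular indexing. The names `pair_index` and `frequent_pairs` and the sample data are assumptions made for this example.

```python
from itertools import combinations


def pair_index(i, j, n):
    """Map a pair of ranks (i, j), 0 <= i < j < n, to a slot in a flat array.

    Pairs are laid out row by row: (0,1), (0,2), ..., (0,n-1), (1,2), ...
    """
    return i * (2 * n - i - 1) // 2 + (j - i - 1)


def frequent_pairs(transactions, min_support):
    """Return {(item_a, item_b): count} for pairs meeting min_support.

    Illustrative only: counts candidate 2-itemsets in a one-dimensional
    array instead of a hash tree. Items must be hashable and sortable.
    """
    # Pass 1: count single items and keep the frequent ones.
    item_counts = {}
    for t in transactions:
        for item in set(t):
            item_counts[item] = item_counts.get(item, 0) + 1
    frequent_items = sorted(x for x, c in item_counts.items() if c >= min_support)
    rank = {item: r for r, item in enumerate(frequent_items)}
    n = len(frequent_items)

    # Pass 2: one flat array holds the support count of every candidate pair.
    counts = [0] * (n * (n - 1) // 2)
    for t in transactions:
        ranks = sorted({rank[x] for x in t if x in rank})
        for i, j in combinations(ranks, 2):
            counts[pair_index(i, j, n)] += 1

    # Collect the pairs whose count reaches the threshold.
    result = {}
    for i, j in combinations(range(n), 2):
        c = counts[pair_index(i, j, n)]
        if c >= min_support:
            result[(frequent_items[i], frequent_items[j])] = c
    return result


if __name__ == "__main__":
    sample = [
        ["a", "b", "c"],
        ["a", "b"],
        ["a", "c"],
        ["b", "c"],
        ["a", "b", "c"],
        ["a", "b", "d"],
    ]
    print(frequent_pairs(sample, min_support=4))
```

Because each candidate pair has a fixed slot, updating its count is a single array access with no tree traversal. Extending the idea to itemsets longer than two would need a different indexing scheme, which the abstract does not describe.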
Source
《计算机工程与应用》
CSCD
Peking University Core Journals list (北大核心)
2002, Issue 11, pp. 1-4, 47 (5 pages in total)
Computer Engineering and Applications
Funding
National 973 Key Basic Research Development Program (Grant No. G1999032705)
Research Start-up Fund for Returned Overseas Scholars