Abstract
The Apriori algorithm must scan the transaction database many times when searching for frequent itemsets, and it may generate a large number of candidate itemsets. To address both problems, a frequent itemset mining algorithm that combines vectors and arrays is proposed. The algorithm scans the transaction database only once, avoids pattern matching, and reduces the generation of worthless candidate itemsets. Comparison with existing algorithms verifies that the proposed algorithm achieves high mining efficiency, and that its advantage becomes more pronounced as the number of items in the database increases.
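The abstract does not spell out the paper's data structures, so the following is only a minimal sketch of the general idea of single-scan, vector-based mining, not the authors' algorithm. It assumes each item is mapped to a bit vector (here a Python integer) that records which transactions contain it; the support of a larger itemset is then obtained by a bitwise AND, with no further database scans and no pattern matching against transactions. The function names `build_item_vectors`, `support`, and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def build_item_vectors(transactions):
    """Scan the database once: bit i of an item's vector is set
    if and only if transaction i contains that item."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(vector):
    """Number of transactions recorded in a bit vector."""
    return bin(vector).count("1")


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets whose absolute
    support is at least min_support (illustrative sketch only)."""
    vectors = build_item_vectors(transactions)  # the only database scan

    # Level 1: frequent single items.
    current = {
        frozenset([item]): vec
        for item, vec in vectors.items()
        if support(vec) >= min_support
    }

    result = {}
    while current:
        for itemset, vec in current.items():
            result[itemset] = support(vec)

        # Join frequent k-itemsets that differ in exactly one item.
        next_level = {}
        for (a, vec_a), (b, vec_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) != len(a) + 1 or union in next_level:
                continue
            # Transactions containing the union are exactly those
            # containing both a and b, so AND the two vectors.
            vec = vec_a & vec_b
            if support(vec) >= min_support:
                next_level[union] = vec
        current = next_level
    return result


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in frequent_itemsets(db, min_support=3).items():
        print(sorted(itemset), count)
```

Because every support count comes from in-memory vectors, the transaction database is read only in `build_item_vectors`. Candidates that fail the support threshold are discarded at the level where they are formed, which is one plausible way to limit worthless candidates; the array component mentioned in the abstract is not modelled here.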
Source
《山东大学学报(理学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2011, No. 3, pp. 31-34 (4 pages)
Journal of Shandong University (Natural Science)