Abstract
Correlation rules carry more practical meaning than traditional association rules. However, existing correlation rule mining algorithms all rely on Apriori-like algorithms to first mine itemsets with high support and then apply a correlation test to the resulting itemsets to obtain correlation rules, which makes rules with low support but high correlation hard to discover. The difficulty in mining correlation rules directly is that candidate correlated items cannot be pruned with an Apriori-like property, so the search space grows explosively. The algorithm proposed in this paper, MNI, uses the lower bound of the Phi correlation coefficient to generate candidate negatively correlated items, thereby shrinking the search space for negatively correlated items; its completeness and correctness are proved. When negative correlation rules are generated from negatively correlated item pairs using rule reliability, a method is proposed that uniformly converts counts of negatively correlated pairs into counts of positively correlated pairs, so negative pairs need not be counted directly. Experiments on real datasets show that MNI effectively speeds up the mining of negatively correlated item pairs.
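The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the MNI algorithm itself, whose details the abstract does not give. It assumes the standard definition of the Phi coefficient for two binary items and uses one elementary bound: Phi increases with the joint support, and the joint support can never fall below max(0, s_a + s_b - 1), so the single-item supports alone give a lower bound on Phi. The function names, the threshold parameter `theta`, and the toy transactions are hypothetical.

```python
from itertools import combinations
from math import sqrt


def phi(s_a, s_b, s_ab):
    """Phi coefficient of two items from their supports (fractions strictly between 0 and 1)."""
    return (s_ab - s_a * s_b) / sqrt(s_a * s_b * (1 - s_a) * (1 - s_b))


def phi_lower_bound(s_a, s_b):
    """Smallest Phi reachable when only the single-item supports are known.

    Phi grows with the joint support, which is at least max(0, s_a + s_b - 1).
    """
    return phi(s_a, s_b, max(0.0, s_a + s_b - 1.0))


def negative_pairs(transactions, theta):
    """Return (a, b, phi) for item pairs with phi <= -theta (hypothetical helper)."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    support = {i: sum(i in t for t in transactions) / n for i in items}
    # Phi is undefined for items that occur in every transaction; skip them.
    items = [i for i in items if 0 < support[i] < 1]
    result = []
    for a, b in combinations(items, 2):
        if phi_lower_bound(support[a], support[b]) > -theta:
            continue  # the pair cannot reach -theta, so its joint count is never computed
        s_ab = sum(a in t and b in t for t in transactions) / n
        value = phi(support[a], support[b], s_ab)
        if value <= -theta:
            result.append((a, b, value))
    return result


# Toy data (assumed): "a" and "b" never occur together.
toy = [{"a", "c"}, {"a", "c"}, {"b"}, {"b", "c"}]
pairs = negative_pairs(toy, theta=0.8)
```

On this toy data only the pair ("a", "b") survives the bound check at `theta=0.8`; the pairs involving "c" are discarded from the single-item supports alone, which is the kind of search-space reduction the abstract describes.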
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal list (北大核心)
2005, No. 10, pp. 124-127, 163 (5 pages in total)
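Funding
Specialized Research Fund for the Doctoral Program of Higher Education
Online analytical processing and gradient mining based on condensed data cubes (Project No. 20030487032)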