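Abstract
This paper proposes an improved algorithm for generating association rules, with the aim of discovering association knowledge efficiently in large databases. To this end, the generalized correlation coefficient from Universal Logics is combined with the Apriori algorithm. The Apriori algorithm itself is efficient on large databases but usually produces too many association rules, whereas the generalized correlation coefficient is a new parameter for measuring correlation. The coefficient is therefore analyzed in detail and compared with the conditional-probability method used by the original algorithm. The resulting algorithm effectively improves the association rule generation algorithm proposed by Agrawal.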
Association rules can help make databases more useful. We propose selecting the more meaningful association rules from those already discovered by the original Apriori algorithm; the selection is done by combining the generalized correlation coefficient with the Apriori algorithm. The generalized correlation coefficient is a new measure of the quality of association rules; it is a concept from the forthcoming book on Universal Logics theory, written by the second author et al. In section 2, we discuss how to use the generalized correlation coefficient to select the more meaningful rules from the excessive number already discovered by the Apriori algorithm. For one example database, the Apriori algorithm discovers 22 rules; using the generalized correlation coefficient, we select from these the 12 more meaningful ones: rules 3, 5, 6, 8, 10, 13, 15, 16, 17, 19, 20, and 22 (Fig. 3).
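The two-stage idea described above (generate rules by conditional probability, then keep only those that also show positive correlation) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the generalized correlation coefficient, so the ordinary phi coefficient is swapped in as a stand-in measure, the rules are restricted to single items on each side, and the transactions, function names, and thresholds are all hypothetical rather than taken from the paper.

```python
from itertools import permutations
from math import sqrt

# Hypothetical toy transactions; NOT the example database from the paper.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
    {"butter"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(a, b, transactions):
    """Conditional probability P(b | a), the measure used by the original algorithm."""
    n_a = count({a}, transactions)
    return count({a, b}, transactions) / n_a if n_a else 0.0


def phi(a, b, transactions):
    """Phi coefficient of 'a present' vs 'b present'.

    Assumption: a stand-in for the generalized correlation coefficient,
    whose definition is not given in the abstract.
    """
    n = len(transactions)
    n_a, n_b = count({a}, transactions), count({b}, transactions)
    n_ab = count({a, b}, transactions)
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    return (n * n_ab - n_a * n_b) / denom if denom else 0.0


def select_rules(transactions, min_conf=0.5, min_corr=0.0):
    """Keep single-item rules a -> b with confidence >= min_conf and phi > min_corr."""
    items = sorted(set().union(*transactions))
    kept = []
    for a, b in permutations(items, 2):
        conf, corr = confidence(a, b, transactions), phi(a, b, transactions)
        if conf >= min_conf and corr > min_corr:
            kept.append((a, b, conf, corr))
    return kept


if __name__ == "__main__":
    for a, b, conf, corr in select_rules(TRANSACTIONS):
        print(f"{a} -> {b}: confidence={conf:.2f}, phi={corr:.2f}")
```

In this toy data the rule butter -> bread passes a confidence threshold of 0.5 (its confidence is about 0.67), yet bread already appears in about 0.67 of all transactions, so the stand-in correlation is zero and the rule is filtered out. That is the kind of high-confidence but uninformative rule a correlation-based filter is meant to remove.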
Source
《西北工业大学学报》
EI
CAS
CSCD
PKU Core Journals
2001, No. 4, pp. 639-643 (5 pages)
Journal of Northwestern Polytechnical University
Funding
Doctoral Program Special Research Fund of the State Education Commission (98069923)
Natural Science Foundation of Shaanxi Province (98X15)