Computing and Pruning Method for Frequent Pattern Interestingness Based on Bayesian Networks (cited by: 4)

Abstract: With domain knowledge represented as a Bayesian network, this paper proposes BN-EJTR, a method that computes the interestingness of frequent itemsets and frequent attribute sets with respect to that knowledge and prunes them accordingly. BN-EJTR aims to discover knowledge that is inconsistent with the current domain knowledge, thereby addressing the interestingness and redundancy problems faced by frequent pattern mining. To meet the need for batch inference during interestingness computation, BN-EJTR provides a Bayesian network inference algorithm based on extended junction tree elimination, which computes the support of a large number of itemsets in the Bayesian network. BN-EJTR also provides a pruning algorithm based on an interestingness threshold and topological interestingness. Experimental results show that, compared with similar methods, BN-EJTR has good time performance and prunes effectively. Analysis shows that the frequent attribute sets and frequent itemsets remaining after pruning meet the interestingness requirement with respect to the domain knowledge.
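The abstract does not give the interestingness formula, so the following is only a minimal sketch under a stated assumption: an itemset's interestingness is taken to be the absolute difference between its support in the data and the probability the Bayesian network assigns to it. The two-node network, its probabilities, the toy rows, and the names `bn_support`, `data_support`, `interestingness`, and `prune` are all hypothetical. Inference here is brute-force enumeration, not the extended junction tree elimination that BN-EJTR uses, and the topological pruning step is omitted.

```python
from itertools import product

# Hypothetical two-node network A -> B over binary attributes (illustrative numbers).
P_A = {1: 0.3, 0: 0.7}
P_B_GIVEN_A = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}

# Toy data set: A and B always agree, which the network above does not predict.
ROWS = [{"A": 1, "B": 1}] * 5 + [{"A": 0, "B": 0}] * 5


def bn_support(itemset):
    """Probability the network assigns to a partial assignment, by enumeration."""
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        world = {"A": a, "B": b}
        if all(world[k] == v for k, v in itemset.items()):
            total += P_A[a] * P_B_GIVEN_A[a][b]
    return total


def data_support(itemset, rows):
    """Fraction of rows that match the partial assignment."""
    hits = sum(all(r[k] == v for k, v in itemset.items()) for r in rows)
    return hits / len(rows)


def interestingness(itemset, rows):
    """Assumed measure: gap between observed support and network-implied support."""
    return abs(data_support(itemset, rows) - bn_support(itemset))


def prune(itemsets, rows, eps):
    """Keep only itemsets whose interestingness reaches the threshold eps."""
    return [s for s in itemsets if interestingness(s, rows) >= eps]


candidates = [{"A": 1}, {"B": 1}, {"A": 1, "B": 1}]
kept = prune(candidates, ROWS, eps=0.25)
```

With these toy numbers, only the itemset whose observed support departs most from what the network expects survives the threshold. Enumeration is exponential in the number of attributes, which is why a batch inference scheme matters when many itemsets must be scored.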
Source: Journal of Software (《软件学报》), 2011, No. 12, pp. 2934-2950 (17 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60828005, 60975034, 61070131)
Keywords: frequent pattern; Bayesian network; junction tree; interestingness; pruning

