
A Top-k High Utility Itemset Mining Method Based on the Index Utility

Cited by: 3
Abstract: Existing Top-k high utility itemset mining methods preserve the downward closure property by substituting an itemset's transaction utility for its real utility. This substitution overestimates itemset utilities, which weakens pruning and lowers mining efficiency. To address this problem, the concept of index utility is proposed. On this basis, a two-level index is built and index pruning is applied, which strengthens pruning during mining and improves the efficiency of Top-k high utility itemset mining. In addition, a utility matrix is built to support fast calculation of itemset utilities, which further improves mining efficiency. Experiments on different types of datasets validate the effectiveness and efficiency of the proposed method.
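The overestimation described in the abstract can be illustrated with a small sketch. It uses the standard textbook definitions of itemset utility and transaction-weighted utility (TWU), not the paper's index utility or two-level index, whose definitions are not given in this record. The toy transactions, the profit table, and the function names `utility`, `twu`, and `top_k` are all hypothetical, and `top_k` is a brute-force baseline rather than the proposed method.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Real utility: quantity * profit of the itemset's own items,
    summed over every transaction that contains the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def twu(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Transaction-weighted utility: the utility of the WHOLE transaction,
    summed over every transaction that contains the itemset.
    It is an upper bound on utility() and is downward closed."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(qty * profit[item] for item, qty in t.items())
    return total


def top_k(k, transactions=TRANSACTIONS, profit=PROFIT):
    """Brute-force baseline: score every non-empty itemset by real utility
    and return the k best as (utility, itemset) pairs."""
    items = sorted(profit)
    scored = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            u = utility(candidate, transactions, profit)
            if u > 0:
                scored.append((u, candidate))
    # Highest utility first; ties broken by itemset order for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


if __name__ == "__main__":
    for itemset in [("a",), ("b",), ("b", "d")]:
        print(itemset, "utility =", utility(itemset), "TWU =", twu(itemset))
    print(top_k(3))
```

In this toy data, `("b",)` has a lower real utility than its superset `("b", "d")`, so real utility is not downward closed, while TWU never increases when the itemset grows. The gap between TWU and real utility is the slack that loosens pruning, which is the problem the abstract sets out to reduce.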
Source: Journal of Northeastern University (Natural Science), 2016, No. 1, pp. 24-28 (5 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (61272177)
Keywords: itemset utility; index utility; Top-k high utility itemset; ending super itemset; utility matrix


