
Mining frequent patterns in data stream with transaction-sensitive sliding window (cited by: 1)
Abstract: As an important problem in data stream mining, frequent pattern mining over a sliding window has been widely studied and applied in recent years. Most existing algorithms must process every element in the data stream, whereas in practice users often care about only certain aspects of the data. Drawing on the MFI-TransSW algorithm, this paper proposes BSW-Filter (Bit Sliding Window with Filter), an algorithm based on a transaction-sensitive sliding window. The algorithm uses bit sequences to implement the sliding-window operation and adds a filtering step for frequent items, which reduces the number of items that must be retained in memory and thereby lowers memory usage and improves processing speed. Its space complexity is independent of the sliding window size and of the value range of the stream's elements, so it is particularly suited to mining data with a wide value range over long periods. Analysis and experiments demonstrate the feasibility and effectiveness of the algorithm.
Authors: Hu Yu (胡彧), Wang Shunping (王顺平)
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 22, pp. 175-177, 183 (4 pages)
Keywords: data streams; data mining; sliding window; frequent pattern
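
The abstract describes two ideas: representing the sliding window with bit sequences, and filtering items so that fewer of them are kept in memory. The Python sketch below illustrates those two ideas only. It is not the authors' BSW-Filter implementation: the class name `BitSlidingWindow`, its methods, and the reading of the filter as a user-supplied set of items of interest are all assumptions made for illustration, and the sketch does not attempt to reproduce the space complexity claimed in the abstract.

```python
from itertools import combinations


class BitSlidingWindow:
    """Illustrative bit-sequence sliding window with an item filter.

    All names here are hypothetical; the paper's actual data structures
    are not given in the abstract.
    """

    def __init__(self, window_size, items_of_interest):
        self.w = window_size
        self.mask = (1 << window_size) - 1    # keeps only the newest w bits
        self.filter = set(items_of_interest)  # assumption: user-supplied filter
        self.bits = {}                        # item -> int used as a bit sequence

    def add(self, transaction):
        """Slide the window by one transaction."""
        kept = self.filter.intersection(transaction)
        # Shift every stored sequence left; the oldest bit falls off via the mask.
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]           # item no longer occurs in the window
        # Set the newest bit for each filtered item in the incoming transaction.
        for item in kept:
            self.bits[item] = self.bits.get(item, 0) | 1

    def support(self, itemset):
        """Count window transactions containing every item of the itemset."""
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

    def frequent_itemsets(self, min_support, max_size=2):
        """Enumerate itemsets up to max_size whose support meets min_support."""
        items = sorted(i for i in self.bits if self.support([i]) >= min_support)
        result = {}
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                s = self.support(combo)
                if s >= min_support:
                    result[combo] = s
        return result
```

With this representation, the support of an itemset is the population count of the bitwise AND of its items' sequences, and sliding the window is a single shift per tracked item. Items outside the filter are never stored, which is the memory saving the abstract attributes to filtering.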


