Abstract
Frequent pattern mining over a sliding window is an important problem in data stream mining and has been widely studied and applied in recent years. Most existing algorithms process every element of the data stream, whereas in practice users are often interested in only certain aspects of the data. Drawing on the MFI-TransSW algorithm, this paper proposes BSW-Filter (Bit Sliding Window with Filter), an algorithm based on a transaction-sensitive sliding window. The algorithm uses bit sequences to implement the sliding window operations, and it adds a filtering step for frequent items that reduces the number of items that must be retained, which lowers memory usage and improves processing speed. Because the space complexity of the algorithm is independent of both the sliding window size and the value range of the data stream, the method is particularly suitable for mining data that spans a long period and a wide range of values. Analysis and experiments demonstrate the feasibility and effectiveness of the algorithm.
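The bit-sequence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the BSW-Filter algorithm itself, whose details the abstract does not give. It only assumes the general scheme associated with MFI-TransSW: one bit sequence per item, a left shift to slide the window, and a bitwise AND to count support. The class name `BitSlidingWindow`, the `items_of_interest` filter, and all method names are hypothetical, and the filter is assumed here to be a user-supplied set of items. Because the sketch keeps one bit per window position for each item, it does not reproduce the space-complexity property claimed for BSW-Filter.

```python
from itertools import combinations


class BitSlidingWindow:
    """Transaction-sensitive sliding window with one bit sequence per item.

    Hypothetical illustration: bit 0 is the newest transaction in the
    window, bit (window_size - 1) is the oldest.
    """

    def __init__(self, window_size, items_of_interest):
        self.mask = (1 << window_size) - 1      # keeps only the newest bits
        self.interest = set(items_of_interest)  # assumed user-supplied filter
        self.bits = {}                          # item -> int used as bit sequence

    def add(self, transaction):
        """Slide the window by one transaction."""
        kept = self.interest & set(transaction)  # drop items outside the filter
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask  # oldest bit falls off
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]  # item no longer occurs in the window
        for item in kept:
            self.bits[item] = self.bits.get(item, 0) | 1

    def support(self, itemset):
        """Number of window transactions containing every item in itemset."""
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

    def frequent_itemsets(self, min_support, max_size=2):
        """Enumerate itemsets up to max_size built from frequent single items."""
        items = sorted(i for i in self.bits if self.support((i,)) >= min_support)
        result = {}
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                count = self.support(combo)
                if count >= min_support:
                    result[combo] = count
        return result


# Usage: a window of 3 transactions that tracks only items a, b and c.
win = BitSlidingWindow(3, {"a", "b", "c"})
for t in [["a", "b", "x"], ["a", "c"], ["a", "b"], ["b", "c"]]:
    win.add(t)
print(win.frequent_itemsets(min_support=2))
```

In this sketch the filter is applied before any bit sequence is created, so items outside the set of interest never occupy memory. That is one plausible reading of how filtering reduces the number of retained items.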
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2010, No. 22, pp. 175-177, 183 (4 pages in total)
Computer Engineering and Applications
Keywords
data streams
data mining
sliding window
frequent pattern