Data stream classification algorithm using time window weighting
Abstract: Data from different time periods influence the classification of the current data stream to different degrees. To account for this, a classification algorithm based on time-window-weighted frequent patterns (TWWFP) is proposed on top of sliding window technology. First, each basic window in the sliding window is assigned a time-dependent window weight. Second, a TWWFP-Tree structure stores the frequent data attributes of every basic window in the current sliding window and is updated in real time. Finally, the average classification error of the weighted attributes over three adjacent sliding windows is monitored; when an abrupt change (mutation) is detected, the length of the next sliding window is promptly reduced so that the algorithm adapts to the change in the data stream. Experiments show that the algorithm improves classification accuracy by up to 3% compared with the classification algorithm without time window weights.
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, No. 1, pp. 41-44 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the National Defense Science and Technology Pre-research Foundation (08J3.74).
Keywords: data stream; sliding window; time window weighting; frequent patterns; window mutation
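
The abstract names three steps but gives no formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' TWWFP algorithm. It assumes a geometric decay for the window weights, replaces the TWWFP-Tree with a plain weighted item counter, and assumes a simple rule for mutation detection: the newest of three adjacent sliding-window errors is compared with the mean of the two before it. The function names and the parameters `decay`, `jump`, and `shrink` are hypothetical.

```python
from collections import Counter


def window_weights(n_windows, decay=0.8):
    """Assumed weighting: the newest basic window gets 1.0, older ones decay geometrically."""
    return [decay ** (n_windows - 1 - i) for i in range(n_windows)]


def weighted_frequent_items(basic_windows, min_support, decay=0.8):
    """Return items whose time-weighted relative support reaches min_support.

    basic_windows is ordered oldest first; each basic window is a list of
    transactions, and each transaction is an iterable of items.
    A flat counter stands in for the TWWFP-Tree, which the abstract does not specify.
    """
    weights = window_weights(len(basic_windows), decay)
    support = Counter()
    total = 0.0
    for weight, window in zip(weights, basic_windows):
        total += weight * len(window)
        for transaction in window:
            for item in set(transaction):
                support[item] += weight
    if total == 0:
        return {}
    return {
        item: count / total
        for item, count in support.items()
        if count / total >= min_support
    }


def next_window_length(errors, current_length, jump=0.1, shrink=0.5, min_length=1):
    """Assumed mutation rule over the errors of three adjacent sliding windows.

    If the newest error exceeds the mean of the two preceding errors by more
    than `jump`, the next sliding window is shortened by the factor `shrink`.
    """
    recent = list(errors)[-3:]
    if len(recent) < 3:
        return current_length
    baseline = (recent[0] + recent[1]) / 2
    if recent[2] - baseline > jump:
        return max(min_length, int(current_length * shrink))
    return current_length


if __name__ == "__main__":
    stream = [[["a", "b"], ["a"]], [["b", "c"], ["b"]]]  # two basic windows
    print(weighted_frequent_items(stream, min_support=0.5, decay=0.5))
    print(next_window_length([0.10, 0.12, 0.30], current_length=8))
```

Because newer basic windows carry larger weights, an item that is frequent only in old data loses support as the window slides, which is the effect the abstract attributes to time window weighting. Shrinking the window after an error jump discards stale data sooner; the threshold and shrink factor here are placeholders and would need to be tuned.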