
Anomaly Detection in Data Streams Based on Tensor Decomposition (published English title: Anomaly Detection in Data Streams Based on Tensors Analysis)
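Abstract (translated from the Chinese): This paper formalizes the problem of continuous anomaly detection over distributed, evolving data streams and proposes WSTA, an anomaly detection algorithm based on tensor decomposition over sliding windows. The algorithm treats the data stream at each distributed node as a sub-tensor of the global data stream. Through communication between the distributed nodes and the central node, each distributed node adaptively samples its sliding window to build a synopsis data structure matrix. Tensor decomposition of this matrix yields feature vectors, and a distance-based anomaly detection method then identifies the anomalous points. Experiments on a large number of real datasets show that the algorithm has good applicability and scalability.

Abstract (English): This paper formalizes the problem of continuous anomaly detection over distributed evolving data streams. It presents WSTA, a novel anomaly detection algorithm based on tensor analysis over the sliding windows of distributed streams. The data stream on each distributed node is treated as a sub-tensor of the global data stream. Based on the exchange of distribution information between the distributed nodes and the central node, a synopsis data structure matrix is produced through adaptive sampling on each distributed node's sliding window. Tensor decomposition is used to extract the distribution features of the sliced data, and anomalies are then found using a distance-based anomaly detection method. Experiments with synthetic data show that the proposed method is both efficient and scalable compared with existing anomaly detection algorithms.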
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2009, No. 6, pp. 75-78 (4 pages)
Funding: Project 985, Phase II
Keywords: anomaly detection; distributed data streams; sliding window; tensor decomposition; adaptive sampling
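
The abstract gives only the outline of WSTA, so the following is a minimal sketch of the general pipeline it describes (sliding window, decomposition into features, distance-based detection), not the authors' algorithm. Several things here are assumptions: the window is treated as a plain matrix (a 2-mode tensor), the leading right singular vector found by power iteration stands in for the tensor decomposition, and adaptive sampling and node-to-center communication are left out entirely. The function names, the sample stream, and the thresholds `p` and `d` are all hypothetical. The distance test follows the DB(p, D)-outlier definition of Knorr and Ng: a point is an outlier if at least a fraction p of the points lie farther than D from it.

```python
from collections import deque
import math


def leading_direction(rows, iters=100):
    """Approximate the top right singular vector of the window matrix
    by power iteration on A^T A (an assumed stand-in for the decomposition)."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        av = [sum(a * b for a, b in zip(row, v)) for row in rows]  # A v
        w = [sum(row[j] * s for row, s in zip(rows, av)) for j in range(dim)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-zero window: keep the current direction
            break
        v = [x / norm for x in w]
    return v


def window_features(rows):
    """Map each row to (score along the leading direction, residual norm)."""
    v = leading_direction(rows)
    feats = []
    for row in rows:
        score = sum(a * b for a, b in zip(row, v))
        energy = sum(a * a for a in row)
        feats.append((score, math.sqrt(max(0.0, energy - score * score))))
    return feats


def db_outliers(points, p, d):
    """Indices of DB(p, d)-outliers: points for which at least a fraction p
    of all points lie farther away than distance d."""
    n = len(points)
    flagged = []
    for i, x in enumerate(points):
        far = sum(1 for y in points if math.dist(x, y) > d)
        if far >= p * n:
            flagged.append(i)
    return flagged


# Hypothetical stream: rows close to the direction (1, 2, 3), plus one odd row.
stream = [
    [1.0, 2.0, 3.0],   # evicted once the window is full
    [1.1, 2.1, 2.9],
    [0.9, 1.9, 3.1],
    [1.0, 2.1, 3.0],
    [1.2, 2.0, 3.1],
    [5.0, -4.0, 0.5],  # the odd row
    [1.0, 2.0, 3.0],
]

window = deque(maxlen=6)  # sliding window keeps the 6 most recent rows
for row in stream:
    window.append(row)

flagged = db_outliers(window_features(list(window)), p=0.8, d=2.0)
print(flagged)  # indices are positions inside the current window
```

In this sketch the features are recomputed from scratch for each window, which is simple but not incremental; a streaming implementation would more likely update the decomposition as rows enter and leave the window.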

