Abstract
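This paper formally describes the problem of continuous anomaly detection over distributed evolving data streams and proposes WSTA, an anomaly detection algorithm based on tensor decomposition within a sliding window. The algorithm treats the data stream at each distributed node as a sub-tensor of the global data stream and, through communication between the distributed nodes and the central node, adaptively samples within each distributed node's sliding window to generate a synopsis data-structure matrix. Tensor decomposition is applied to this data matrix to obtain feature vectors, and a distance-based anomaly detection method is then used to find anomalous points. Experiments on a large number of real datasets show that the algorithm has good applicability and scalability.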
This paper formalizes the problem of continuous anomaly detection over distributed evolving data streams. It presents WSTA, a novel anomaly detection algorithm based on tensor analysis over the sliding windows of the distributed streams. The data stream on each distributed node is treated as a sub-tensor of the global data stream; based on the distribution information exchanged between the distributed nodes and the central node, a synopsis data-structure matrix is produced by adaptive sampling over each distributed node's sliding window. Tensor decomposition is used to extract the distribution features of the sliced data, and anomalies are then found with a distance-based anomaly detection method. Our experiments with synthetic data show that the proposed method is both efficient and scalable compared with existing anomaly detection algorithms.
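The single-node part of the pipeline outlined in the abstract (window, decomposition, distance-based detection) can be illustrated with a minimal sketch. This is not the WSTA algorithm itself: the distributed communication and the adaptive sampling step are omitted, a plain SVD of the window matrix stands in for the tensor decomposition, and a k-nearest-neighbour distance rule stands in for the distance-based detector. The function name `detect_window_anomalies` and all parameters (`window_size`, `rank`, `k`, `threshold`) are hypothetical choices made for illustration.

```python
import numpy as np


def detect_window_anomalies(stream, window_size=50, rank=2, k=3, threshold=3.0):
    """Return indices (relative to the window) of suspected anomalies.

    Assumption: `stream` is a 2-D array, one record per row, oldest first.
    """
    # Sliding window: keep only the most recent `window_size` records.
    window = np.asarray(stream, dtype=float)[-window_size:]

    # Decomposition step (assumed stand-in): SVD of the centered window matrix.
    # A matrix is a second-order tensor, so this is the simplest special case.
    centered = window - window.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)

    # Feature vectors: each record projected onto the top `rank` directions.
    features = centered @ vt[:rank].T

    # Distance-based detection (assumed rule): compare each record's distance
    # to its k-th nearest neighbour against a multiple of the median distance.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # After sorting, column 0 is the zero distance from a record to itself.
    kth_neighbour = np.sort(dists, axis=1)[:, k]
    return np.flatnonzero(kth_neighbour > threshold * np.median(kth_neighbour))


# Usage: 60 synthetic records with one injected outlier in the last row.
rng = np.random.default_rng(0)
stream = rng.normal(0.0, 0.5, size=(60, 4))
stream[59] = 20.0
flagged = detect_window_anomalies(stream, window_size=50)
print(flagged)  # window index 49 corresponds to stream row 59
```

Because the threshold is relative to the median neighbour distance, a few records on the fringe of the normal cluster may be flagged alongside the injected outlier; raising `threshold` trades recall for fewer such false alarms.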
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2009, Issue 6, pp. 75-78 (4 pages)
Computer Engineering & Science
Funding
Project 985, Phase II
Keywords
anomaly detection
distributed data streams
sliding window
tensor decomposition
adaptive sampling