Abstract
An effective distance metric is essential for time series clustering. To improve clustering performance, metric learning can be used to learn a suitable distance metric from the data. However, existing metric learning methods overlook the temporal characteristics of time series, and side information such as pairwise constraints is difficult to obtain for time series data. This paper proposes a distance metric learning method based on side information autogeneration for time series (SIADML). The method exploits the strength of dynamic time warping (DTW) distance in capturing temporal characteristics to generate pairwise constraints automatically, so that the learned metric preserves the inherent neighbor relationships among time series as far as possible. Experimental results on a series of benchmark time series datasets show that the learned metric effectively improves time series clustering performance.
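The abstract does not say how pairwise constraints are derived from DTW distances, so the following is only a minimal sketch of the general idea, not the SIADML procedure itself. It assumes a hypothetical rule: for each series, the `k` DTW-nearest series form "similar" pairs and the `k` DTW-farthest form "dissimilar" pairs. The function names, the parameter `k`, and the absolute-difference local cost are all illustrative assumptions, and the subsequent metric learning step is omitted.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def generate_constraints(series, k=1):
    """Hypothetical rule (an assumption, not from the paper): the k DTW-nearest
    neighbors of each series give similar pairs, the k DTW-farthest give
    dissimilar pairs. Pairs are stored as (smaller_index, larger_index)."""
    similar, dissimilar = set(), set()
    for i, s in enumerate(series):
        others = [j for j in range(len(series)) if j != i]
        ranked = sorted(others, key=lambda j: dtw_distance(s, series[j]))
        for j in ranked[:k]:
            similar.add((min(i, j), max(i, j)))
        for j in ranked[-k:]:
            dissimilar.add((min(i, j), max(i, j)))
    return similar, dissimilar


# Toy data: two "bump" shapes (one time-shifted) and two near-flat series.
toy_series = [
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1, 0],
    [5, 5, 5, 5, 5],
    [5, 5, 6, 5, 5],
]
similar_pairs, dissimilar_pairs = generate_constraints(toy_series, k=1)
```

On the toy data, the time-shifted bump is paired with the original bump as similar even though the two series have different lengths, which is the kind of temporal alignment a plain Euclidean distance cannot express. A real pipeline would pass such pairs to a metric learner; that part is left out here because the abstract gives no details.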
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (Peking University Core Journals)
2013, Issue 11, pp. 2642-2655 (14 pages)
Funding
National Natural Science Foundation of China (61139002)
Keywords
metric learning
dynamic time warping
side information autogeneration
time series clustering