Abstract: Pattern-based time series segmentation (PTSS) is an important task for many time series data mining applications. In this paper, a generalized model for PTSS is proposed according to the characteristics of the problem. First, a new interpretation of PTSS is given by comparing it with prototype-based clustering (PC). Then, a novel model, called the clustering-inverse model (CI-model), is presented. Finally, two algorithms are presented to implement this model. Experimental results on artificial and real-world time series demonstrate that the proposed algorithms are quite effective.
Funding: Supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2024ZD0608100) and the National Natural Science Foundation of China (62332017, U22A2022).
Abstract: In high-risk industrial environments like nuclear power plants, precise defect identification and localization are essential for maintaining production stability and safety. However, the complexity of such a harsh environment leads to significant variations in the shape and size of the defects. To address this challenge, we propose the multivariate time series segmentation network (MSSN), which adopts a multiscale convolutional network with multi-stage and depth-separable convolutions for efficient feature extraction through variable-length templates. To tackle the classification difficulty caused by structural signal variance, MSSN employs logarithmic normalization to adjust instance distributions. Furthermore, it integrates classification with smoothing loss functions to accurately identify defect segments amid similar structural and defect signal subsequences. Evaluated on both the Mackey-Glass dataset and an industrial dataset, our algorithm achieves over 95% localization and demonstrates its capture capability on the synthetic dataset. On a nuclear plant's heat transfer tube dataset, it captures 90% of defect instances with a 75% middle localization F1 score.
Funding: This work is supported by the National Key Research and Development Program of China (2022YFF1203001), the National Natural Science Foundation of China (Nos. 62072465, 62102425), and the Science and Technology Innovation Program of Hunan Province (Nos. 2022RC3061, 2023RC3027).
Abstract: Time series segmentation, which aims to divide a time series into segments that each reflect a state of the monitored object, has attracted growing interest in recent years. Although there have been many surveys on time series segmentation, most of them focus on change point detection (CPD) methods and overlook the advances in boundary detection (BD) and state detection (SD) methods. In this paper, we categorize time series segmentation methods into CPD, BD, and SD methods, with a specific focus on recent advances in BD and SD methods. Within the scope of BD and SD, we subdivide the methods based on their underlying models and techniques and focus on the milestones that have shaped the development trajectory of each category. In conclusion, we find that: (1) existing methods fail to provide sufficient support for online operation, with only a few methods supporting online deployment; (2) most existing methods require parameters to be specified, which hinders their ability to work adaptively; (3) existing SD methods do not attach importance to accurate detection of boundary points in evaluation, which may lead to limitations in boundary point detection. We highlight the ability to work online and adaptively as important attributes of segmentation methods, and boundary detection accuracy as a neglected metric for SD methods.
Funding: This work is supported by the National Natural Science Foundation of China (Nos. 11971215, 11871210, and 11971214) and the Key Laboratory of Applied Mathematics and Complex Systems of Lanzhou University.
Abstract: Multivariate time series segmentation is an important problem in data mining, and it has arisen in more and more practical applications in recent years. The task of time series segmentation is to partition a time series into segments by detecting abrupt changes or anomalies in the series. Multivariate time series segmentation can provide meaningful information for further data analysis, prediction, and policy decisions. Because a time series can be considered a piecewise continuous function, it is natural to take its total variation norm as prior information. In this paper, by minimizing the negative log-likelihood function of a time series, we propose a total variation based model for multivariate time series segmentation. An iterative process is applied to solve the proposed model, and a search combined with the dynamic programming method is designed to determine the breakpoints. The experimental results show that the proposed method is efficient for multivariate time series segmentation and is competitive with existing methods.
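The abstract above mentions a dynamic programming search for breakpoints but does not give its details. As a rough illustration only, the following Python sketch shows a generic dynamic program that splits a univariate series into `k` contiguous segments by minimizing within-segment squared error. It is not the paper's total variation model; the function name `segment_dp`, the squared-error cost, and the fixed segment count `k` are all assumptions made for this example.

```python
from typing import List, Tuple


def segment_dp(series: List[float], k: int) -> Tuple[List[int], float]:
    """Split `series` into k contiguous segments minimizing the total
    within-segment squared error. Returns (breakpoints, total cost).

    Hypothetical helper for illustration; not the method of the paper.
    """
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(series)")

    # Prefix sums let us compute any segment's squared error in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(series):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        # Squared error of series[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best cost of splitting series[:j] into m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Follow the back pointers to recover where each later segment starts.
    breaks: List[int] = []
    j = n
    for m in range(k, 1, -1):
        j = back[m][j]
        breaks.append(j)
    breaks.reverse()
    return breaks, cost[k][n]
```

Each returned breakpoint is the index at which a new segment starts. The triple loop costs O(k·n²) time, which is acceptable for short series but would need pruning for long ones.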
Funding: Supported by the Basic Research Projects of Liaoning Provincial Department of Education (LJKQZ20222459).
Abstract: Existing trajectory compression algorithms ignore the importance of starting point features in trajectory segmentation. To address this problem, a study was conducted on the preprocessing of trajectory time series. First, an improvement was proposed to the segmentation algorithm GRASP-UTS (Greedy Randomized Adaptive Search Procedure for Unsupervised Trajectory Segmentation). While taking trajectory coverage into account, the improved algorithm adds adaptive parameter adjustment, so that long-term trajectory data are segmented reasonably, and identifies an optimal starting point for segmentation. Then the compression efficiency of typical offline and online algorithms, such as the Douglas-Peucker algorithm and the Sliding Window algorithm and its enhancements, was compared before and after segmentation. The experimental findings show that the Adaptive Parameters GRASP-UTS segmentation approach leads to higher fitting precision in trajectory time series compression and improved algorithm efficiency after segmentation. Additionally, the compression performance of the Improved Sliding Window algorithm after segmentation shows that it is suitable for trajectories of varying scales and provides reasonable compression accuracy.
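The Douglas-Peucker algorithm named in the abstract above is a standard offline line-simplification method. A minimal Python sketch of the textbook recursive form follows, assuming 2D points and a perpendicular-distance tolerance `epsilon`; the function names are chosen for this example, and the paper's own variants and parameter settings are not reproduced here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _perp_dist(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        # a and b coincide, so fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / norm


def douglas_peucker(points: List[Point], epsilon: float) -> List[Point]:
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]

    # Find the interior point farthest from the chord first -> last.
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], first, last)
        if d > dmax:
            idx, dmax = i, d

    if dmax <= epsilon:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]

    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(points[: idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right  # drop the duplicated split point
```

For example, `douglas_peucker([(0, 0), (1, 0), (2, 0), (3, 3), (4, 0), (5, 0)], 0.5)` drops only the collinear point `(1, 0)`, while a tolerance of `5.0` keeps just the two endpoints.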
Funding: Supported by the National High-Tech R&D Program (863) of China (Nos. 2012AA012600, 2011AA010702, 2012AA01A401, and 2012AA01A402), the National Natural Science Foundation of China (No. 60933005), and the National Science and Technology of China (No. 2012BAH38B04).
Abstract: The symbolic representation of time series has attracted much research interest recently. The high dimensionality typical of the data is challenging, especially as the time series becomes longer. The wide distribution of sensors collecting more and more data exacerbates the problem. Representing a time series effectively is an essential task for decision-making activities such as classification, prediction, and knowledge discovery. In this paper, we propose a new symbolic representation method for long time series based on trend features, called trend feature symbolic approximation (TFSA). The method uses a two-step mechanism to segment long time series rapidly. Unlike some previous symbolic methods, it focuses on retaining most of the trend features and patterns of the original series. A time series is represented by trend symbols, which are also suitable for use in knowledge discovery, such as association rule mining. TFSA provides a lower-bounding guarantee. Experimental results show that, compared with some previous methods, it not only has better segmentation efficiency and classification accuracy, but is also applicable to knowledge discovery from time series.
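The abstract above does not describe TFSA's two-step segmentation or its symbol alphabet, so the following Python sketch only illustrates the general idea of representing a series by trend symbols. It assumes fixed-length windows, a least-squares slope per window, and a three-letter alphabet (`U` up, `D` down, `F` flat); the function name `trend_symbols` and the `flat_tol` threshold are hypothetical, and the sketch offers no lower-bounding guarantee.

```python
from typing import List


def trend_symbols(series: List[float], window: int, flat_tol: float = 0.1) -> str:
    """Map each full window of `series` to 'U', 'D', or 'F' by its slope.

    Illustrative only; this is not the TFSA method. A trailing partial
    window is ignored.
    """
    if window < 2:
        raise ValueError("window must be at least 2")

    # The time axis 0..window-1 is the same for every window.
    t_mean = (window - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(window))

    symbols: List[str] = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        y_mean = sum(chunk) / window
        # Ordinary least-squares slope of the window against time.
        slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(chunk)) / denom
        if slope > flat_tol:
            symbols.append("U")
        elif slope < -flat_tol:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)
```

A series that rises, plateaus, and then falls over three windows of length 4 maps to the string `UFD`, which can then be fed to string-based mining tools.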
Funding: Supported by the National Natural Science Foundation of China (62173053, 62006071, 62006033), the Science and Technology Project of Science and Technology Department of Henan Province (242102210016, 212102210149), the National Natural Science Foundation of Henan Province (252300421806), and the Backbone Teacher Training Program of Henan University of Technology.
Abstract: Time series segmentation aims to extract meaningful subsequences from complex temporal information. A proper segmentation can effectively help users analyze the structure of a time series. In this study, we propose an information granulation-based fuzzy clustering method for time series segmentation. The proposed method follows the procedure of the fuzzy c-means clustering method. First, the original time series is randomly divided into several segments. Then, an information granulation-based dynamic time warping approach is designed to update the series centers, where the principle of reasonable granularity is used to calculate the mean of the segments. Next, the time series segments are clustered by optimizing the objective function. Finally, the optimal segmentation points are generated by merging contiguous segments that fall in the same cluster. The experimental results show that the proposed method has advantages over existing segmentation methods.
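The final step described above, merging contiguous segments that share a cluster, can be sketched in a few lines of Python. The sketch assumes segments are given as half-open `(start, end)` index pairs in time order together with one hard cluster label per segment; the function name `merge_segments` and this input format are assumptions for illustration, and the fuzzy clustering step itself is not shown.

```python
from typing import List, Tuple


def merge_segments(
    bounds: List[Tuple[int, int]], labels: List[int]
) -> List[Tuple[int, int, int]]:
    """Merge adjacent segments with the same cluster label.

    `bounds` holds half-open (start, end) pairs in time order and
    `labels` holds one cluster id per segment. Returns a list of
    (start, end, label) triples. Hypothetical helper for illustration.
    """
    if len(bounds) != len(labels):
        raise ValueError("bounds and labels must have the same length")

    merged: List[Tuple[int, int, int]] = []
    for (start, end), label in zip(bounds, labels):
        if merged and merged[-1][2] == label and merged[-1][1] == start:
            # Same cluster and directly adjacent: extend the previous segment.
            prev_start, _, _ = merged[-1]
            merged[-1] = (prev_start, end, label)
        else:
            merged.append((start, end, label))
    return merged
```

The segmentation points are then the start indices of every merged segment after the first, for example `[s for s, _, _ in merged[1:]]`.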