High-quality data is essential for the success of data-driven learning tasks.The characteristics,precision,and completeness of the datasets critically determine the reliability,interpretability,and effectiveness of su...High-quality data is essential for the success of data-driven learning tasks.The characteristics,precision,and completeness of the datasets critically determine the reliability,interpretability,and effectiveness of subsequent analyzes and applications,such as fault detection,predictive maintenance,and process optimization.However,for many industrial processes,obtaining sufficient high-quality data remains a significant challenge due to high costs,safety concerns,and practical constraints.To overcome these challenges,data augmentation has emerged as a rapidly growing research area,attracting considerable attention across both academia and industry.By expanding datasets,data augmentation techniques improve greater generalization and more robust performance in actual applications.This paper provides a comprehensive,multi-perspective review of data augmentation methods for industrial processes.For clarity and organization,existing studies are systematically grouped into four categories:small sample with low dimension,small sample with high dimension,large sample with low dimension,and large sample with high dimension.Within this framework,the review examines current research from both methodological and application-oriented perspectives,highlighting main methods,advantages,and limitations.By synthesizing these findings,this review offers a structured overview for scholars and practitioners,serving as a valuable reference for newcomers and experienced researchers seeking to explore and advance data augmentation techniques in industrial processes.展开更多
基金supported by the Postdoctoral Fellowship Program(Grade B)of China(GZB20250435)the National Natural Science Foundation of China(62403270).
文摘High-quality data is essential for the success of data-driven learning tasks.The characteristics,precision,and completeness of the datasets critically determine the reliability,interpretability,and effectiveness of subsequent analyzes and applications,such as fault detection,predictive maintenance,and process optimization.However,for many industrial processes,obtaining sufficient high-quality data remains a significant challenge due to high costs,safety concerns,and practical constraints.To overcome these challenges,data augmentation has emerged as a rapidly growing research area,attracting considerable attention across both academia and industry.By expanding datasets,data augmentation techniques improve greater generalization and more robust performance in actual applications.This paper provides a comprehensive,multi-perspective review of data augmentation methods for industrial processes.For clarity and organization,existing studies are systematically grouped into four categories:small sample with low dimension,small sample with high dimension,large sample with low dimension,and large sample with high dimension.Within this framework,the review examines current research from both methodological and application-oriented perspectives,highlighting main methods,advantages,and limitations.By synthesizing these findings,this review offers a structured overview for scholars and practitioners,serving as a valuable reference for newcomers and experienced researchers seeking to explore and advance data augmentation techniques in industrial processes.