Abstract: The rapid digitalization of the energy sector has led to the deployment of large-scale smart metering systems that generate high-frequency time series data, creating new opportunities and challenges for energy anomaly detection. Accurate identification of anomalous patterns in building energy consumption is essential for optimizing operations, improving energy efficiency, and supporting grid reliability. This study investigates advanced feature engineering and machine learning modeling techniques for large-scale time series anomaly detection in building energy systems. Expanding upon previous benchmark frameworks, we introduce additional features such as oil price indices and solar cycle indicators, including sunset and sunrise times, to enhance the contextual understanding of consumption patterns. Our comparative modeling approach encompasses an extensive suite of algorithms, including KNeighborsUnif, KNeighborsDist, LightGBMXT, LightGBM, RandomForestMSE, CatBoost, ExtraTreesMSE, NeuralNetFastAI, XGBoost, NeuralNetTorch, and LightGBMLarge. Data preprocessing includes rigorous handling of missing values and normalization, while feature engineering focuses on temporal, environmental, and value-change attributes. The models are evaluated on a comprehensive dataset of smart meter readings, with performance assessed using metrics such as the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). The results demonstrate that the integration of diverse exogenous variables and a hybrid ensemble of traditional tree-based and neural network models can significantly improve anomaly detection performance. This work provides new insights into the design of robust, scalable, and generalizable frameworks for energy anomaly detection in complex, real-world settings.
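To make two of the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes meter readings arrive as (timestamp, value) pairs, derives simple temporal and value-change features from them, and computes AUC-ROC directly from its pairwise definition. The function names `value_change_features` and `auc_roc`, the feature names, and the sample data are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def value_change_features(readings):
    """Build temporal and value-change features for each (timestamp, value) reading.

    Hypothetical feature set: hour of day, weekday, absolute change and
    relative change versus the previous reading (None when undefined).
    """
    features = []
    previous = None
    for timestamp, value in readings:
        if previous is None:
            diff, pct_change = None, None
        else:
            diff = value - previous
            # Relative change is undefined when the previous reading is zero.
            pct_change = None if previous == 0 else diff / previous
        features.append({
            "hour": timestamp.hour,
            "weekday": timestamp.weekday(),  # Monday is 0
            "diff": diff,
            "pct_change": pct_change,
        })
        previous = value
    return features


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random anomalous sample (label 1)
    receives a higher score than a random normal sample (label 0).
    Ties count as one half. Quadratic in sample count, so only for small data.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up hourly readings, for illustration only.
    sample = [
        (datetime(2024, 1, 1, 0), 10.0),
        (datetime(2024, 1, 1, 1), 12.0),
        (datetime(2024, 1, 1, 2), 12.0),
    ]
    for row in value_change_features(sample):
        print(row)
    print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form of AUC-ROC is used here because it needs no external library and makes the metric's meaning explicit. On a dataset at the scale the abstract describes, a sort-based implementation from an established library would be the practical choice.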