The transition towards a more sustainable environment requires the development of new control systems on the demand side to integrate renewable energy sources into the energy systems. For this purpose, energy meter data from homes have been broadly used in modelling, forecasting, and optimal control of energy use. However, the usability and reliability of household energy meter data have not been specifically addressed. In this study, we apply commonly used machine learning methods to the heating consumption data of (1) two individual homes in an apartment building and (2) the district heating substation of the apartment building, which includes 72 homes, to identify how the characteristics of the data affect the results of data analysis. Two clustering approaches were applied using the K-means algorithm to group similar daily heating profiles. Using the clustering results, different classification algorithms, such as logistic regression and random forest, were applied to predict the heating consumption level with regard to the weather conditions. The data analysis process showed that the substation data, which is the aggregated heating consumption of the 72 homes, is more reliable and valid for energy prediction than the data from the two individual homes. This is due to the large variation and uncertainty in the daily energy use of individual homes.
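To make the clustering step concrete, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to daily heating profiles. It is not the authors' code: the function names, the random initialisation, and the toy profiles (24 hourly values per day) are assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iters=50, seed=0):
    """Group daily profiles into k clusters with plain Lloyd iterations."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k profiles drawn at random.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical data: three low-consumption days and three high-consumption days.
low_days = [[1.0] * 24, [1.1] * 24, [0.9] * 24]
high_days = [[5.0] * 24, [5.2] * 24, [4.8] * 24]
labels, centroids = kmeans(low_days + high_days, k=2)
```

The resulting cluster labels could then serve as the target classes for a classifier that takes weather variables as input, which is the second stage the abstract describes.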
Continuously publishing histograms in data streams is crucial to many real-time applications, as it not only provides critical statistical information but also reduces the risk of privacy leakage. As the importance of elements usually decreases over time in data streams, in this paper we model a data stream as a sequence of weighted sliding windows and then study how to publish histograms over these windows continuously. Existing methods can hardly solve this problem in real time, because they need to buffer all elements in each sliding window, resulting in high computational overhead and a prohibitive storage burden. In this paper, we overcome this drawback by proposing an online algorithm, Efficient Streaming Histogram Publishing (ESHP), to continuously publish histograms over weighted sliding windows. Specifically, our method first creates a novel sketching structure, called Approximate-Estimate Sketch (AESketch), to maintain the counting information of each histogram interval at every time instance; then, it creates histograms that satisfy the differential privacy requirement by adding appropriate noise values to the sketching structure. Extensive experimental results and rigorous theoretical analysis demonstrate that the ESHP method can offer equivalent data utility with significantly lower computational overhead and storage costs when compared to other existing methods.
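The general idea of publishing a noisy histogram over a time-weighted stream without buffering elements can be sketched as follows. This is neither ESHP nor AESketch, whose details are not given in the abstract: the class name, the exponential decay used as the weighting scheme, and the per-release Laplace noise are all assumptions chosen for illustration. The sketch also ignores how the privacy budget accumulates over repeated releases, which a continuous publishing method would need to handle.

```python
import random


class DecayedHistogram:
    """Hypothetical time-weighted histogram with Laplace-noised release."""

    def __init__(self, edges, decay=0.9):
        self.edges = edges    # sorted bin boundaries, len = number of bins + 1
        self.decay = decay    # assumed weight multiplier applied per time step
        self.counts = [0.0] * (len(edges) - 1)

    def _bin(self, x):
        """Return the index of the bin containing x, or None if out of range."""
        for i in range(len(self.counts)):
            if self.edges[i] <= x < self.edges[i + 1]:
                return i
        return None

    def tick(self, items):
        """Advance one time step: age old counts, then add the new items."""
        self.counts = [c * self.decay for c in self.counts]
        for x in items:
            i = self._bin(x)
            if i is not None:
                self.counts[i] += 1.0

    def publish(self, epsilon, rng=random):
        """Return counts plus Laplace noise of scale 1/epsilon.

        One element changes a single bin by at most 1, so the L1
        sensitivity of one release is taken to be 1. The difference of
        two exponential draws with rate epsilon is Laplace(0, 1/epsilon).
        """
        return [
            c + rng.expovariate(epsilon) - rng.expovariate(epsilon)
            for c in self.counts
        ]


hist = DecayedHistogram(edges=[0, 10, 20], decay=0.5)
hist.tick([1, 2, 15])   # counts become [2.0, 1.0]
hist.tick([3])          # old counts halve, then one new item is added
noisy = hist.publish(epsilon=1.0)
```

Only one running count per bin is stored, so memory does not grow with the window length, which is the property that motivates sketch-based approaches in this setting.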
Funding: supported by the Program for Synergy Innovation in the Anhui Higher Education Institutions of China (No. GXXT-2020-012), the National Natural Science Foundation of China (No. 62172003), the Anhui Provincial Natural Science Foundation (No. 2108085MF218), the Anhui Province University Natural Science Research Project (No. 2022AH040052), and the Science and Technology Innovation Program of Ma'anshan, China (No. 2021a120009).