Funding: Supported by an award from the National Science Foundation, USA, grant ECCF 1936494.
Abstract: In this paper, we present a Q-Learning optimization algorithm for smart home HVAC systems. The proposed algorithm combines new convex deep neural network models with model predictive control (MPC) techniques. More specifically, new input convex long short-term memory (ICLSTM) models are employed to predict dynamic states within an MPC optimal control technique integrated into a Q-Learning reinforcement learning (RL) algorithm, further improving the learned temporal behaviors of nonlinear HVAC systems. As a novel RL approach, the proposed algorithm generates day-ahead HVAC demand response (DR) signals in smart homes that optimally reduce and/or shift peak energy usage, reduce electricity costs, minimize user discomfort, and honor, on a best-effort basis, the recommendations of the utility/aggregator, which in turn affects the overall well-being of the distribution network controlled by the aggregator. The proposed Q-Learning optimization algorithm, based on epsilon-model predictive control (e-MPC), can be implemented as a control agent executed by the smart house energy management (SHEM) system assumed to exist in the smart home, which can interact with the energy provider of the distribution network, i.e., the utility/aggregator, via the smart meter. The output generated by the proposed control agent represents day-ahead local DR signals in the form of temperature setpoints for the HVAC system, found by the optimization process to yield the desired trade-offs between electricity cost and user discomfort. The proposed algorithm can be used to transform smart homes with passive HVAC controllers, which react solely to end-user setpoints, into smart homes with active HVAC controllers; such systems respond not only to the preferences of the end-user but also to an external control signal provided by the utility or aggregator. Simulation experiments conducted with a custom simulation tool demonstrate that the proposed optimization framework can offer significant benefits: it achieves an 87% higher success rate in optimizing setpoints within the desired range, resulting in up to 15% energy savings and zero temperature discomfort.
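To make the optimization concrete, a plausible form of the receding-horizon objective implied by the abstract is sketched below; it balances electricity cost against occupant discomfort subject to ICLSTM-predicted dynamics. The notation (the trade-off weight $\lambda$, tariff $p_k$, energy model $E$, and predictor $f_{\mathrm{ICLSTM}}$) is assumed for illustration rather than taken from the paper:

\min_{u_0,\dots,u_{N-1}} \; \sum_{k=0}^{N-1} \Big[ \lambda \, p_k \, E(x_k, u_k) + (1-\lambda) \, \lvert T_k - T^{\mathrm{ref}} \rvert \Big] \quad \text{s.t.} \quad x_{k+1} = f_{\mathrm{ICLSTM}}(x_k, u_k, d_k),

where $u_k$ is the temperature setpoint at hour $k$, $p_k$ the electricity price, $E(\cdot)$ the predicted HVAC energy, $T_k$ the predicted indoor temperature, $T^{\mathrm{ref}}$ the occupant preference, $d_k$ exogenous inputs such as weather and occupancy, and $N$ the day-ahead horizon.

The interaction between Q-Learning and the MPC-style rollout can likewise be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the ICLSTM predictor is replaced by a toy first-order thermal model, the Q-function is tabular, epsilon is interpreted as epsilon-greedy exploration, and all constants (tariff, comfort temperature, hyperparameters) are illustrative placeholders.

# Minimal sketch (not the paper's implementation) of an epsilon-greedy
# Q-Learning loop that selects day-ahead hourly HVAC setpoints via an
# MPC-style rollout. The ICLSTM state predictor is stood in for by a toy
# first-order thermal model; every constant and name here is an
# illustrative assumption, not a value from the paper.
import random

SETPOINTS = [20.0, 21.0, 22.0, 23.0, 24.0]     # candidate setpoints (deg C)
HORIZON = 24                                    # day-ahead horizon (hours)
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1          # Q-Learning hyperparameters
PRICE = [0.10] * 16 + [0.30] * 4 + [0.10] * 4   # toy time-of-use tariff ($/kWh)
COMFORT = 22.0                                  # occupant preference (deg C)

def predict_next_temp(indoor, setpoint, outdoor):
    """Placeholder for the ICLSTM dynamic-state predictor."""
    return indoor + 0.3 * (setpoint - indoor) + 0.1 * (outdoor - indoor)

def hvac_energy(indoor, setpoint):
    """Toy energy model: consumption grows with the tracking error (kWh)."""
    return 0.5 * abs(setpoint - indoor)

def reward(hour, indoor, setpoint):
    """Negative sum of electricity cost and temperature discomfort."""
    cost = PRICE[hour] * hvac_energy(indoor, setpoint)
    discomfort = abs(indoor - COMFORT)
    return -(cost + discomfort)

Q = {}  # tabular Q-function keyed by (hour, rounded indoor temp, setpoint)

def q_value(hour, indoor, setpoint):
    return Q.get((hour, round(indoor), setpoint), 0.0)

def choose_setpoint(hour, indoor):
    """Epsilon-greedy action selection over the candidate setpoints."""
    if random.random() < EPSILON:
        return random.choice(SETPOINTS)
    return max(SETPOINTS, key=lambda s: q_value(hour, indoor, s))

def train_episode(outdoor_forecast, indoor=21.0):
    """One day-ahead episode: roll the predictor forward and update Q."""
    schedule = []
    for hour in range(HORIZON):
        setpoint = choose_setpoint(hour, indoor)
        next_indoor = predict_next_temp(indoor, setpoint, outdoor_forecast[hour])
        r = reward(hour, indoor, setpoint)
        if hour + 1 < HORIZON:
            best_next = max(q_value(hour + 1, next_indoor, s) for s in SETPOINTS)
        else:
            best_next = 0.0
        key = (hour, round(indoor), setpoint)
        Q[key] = q_value(hour, indoor, setpoint) + ALPHA * (
            r + GAMMA * best_next - q_value(hour, indoor, setpoint))
        schedule.append(setpoint)
        indoor = next_indoor
    return schedule

if __name__ == "__main__":
    forecast = [10.0] * 8 + [18.0] * 8 + [12.0] * 8  # toy outdoor forecast
    for _ in range(500):
        day_ahead_setpoints = train_episode(forecast)
    print(day_ahead_setpoints)  # candidate day-ahead DR signal (setpoints)

Repeated training over the day-ahead episode gradually biases the learned setpoint schedule away from high-price hours while keeping the predicted indoor temperature near the comfort reference, which is the cost/discomfort trade-off the abstract describes.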