Funding: Supported by the Primary Research and Development Plan of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Nos. 62072324, 61876217, 61876121, 61772357), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), the Primary Research and Development Plan of Jiangsu Province (No. BE2020026), and the Natural Science Foundation of Jiangsu Province (No. BK20190942).
Abstract: Optimizing multi-zone residential heating, ventilation, and air conditioning (HVAC) control is difficult because of the complex dynamic thermal model and the uncertainty of occupant-driven cooling loads. Deep reinforcement learning (DRL) methods have recently been proposed to address the HVAC control problem. However, applying single-agent DRL to multi-zone residential HVAC control may lead to non-convergence or slow convergence. In this paper, we propose MAQMC (Multi-Agent deep Q-network for multi-zone residential HVAC Control) to address this challenge, with the goal of minimizing energy consumption while maintaining occupants' thermal comfort. MAQMC comes in two variants: MAQMC2, with two agents (one controls the temperature of each zone and the other controls the humidity of each zone), and MAQMC3, with three agents (each controls the temperature and humidity of one of three zones). The experimental results show that, compared with the fixed-point method, MAQMC3 reduces energy consumption by 6.27% and MAQMC2 by 3.73%; compared with the rule-based method, MAQMC3 and MAQMC2 reduce comfort violations by 61.89% and 59.07%, respectively. In addition, experiments with weather data from different regions demonstrate that well-trained MAQMC agents are robust and adaptable to unknown environments.
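The abstract does not say why a single agent converges poorly. One plausible factor is the size of the joint action space, and the following sketch only illustrates that arithmetic. The setpoint counts (`n_temp`, `n_humid`) and all function names are assumptions made for illustration, not values or code from the paper.

```python
def joint_action_count(n_zones, n_temp, n_humid):
    """Single agent: picks a temperature and a humidity setpoint for every zone at once."""
    return (n_temp * n_humid) ** n_zones


def maqmc2_action_counts(n_zones, n_temp, n_humid):
    """Two agents: one picks all zone temperatures, the other all zone humidities."""
    return [n_temp ** n_zones, n_humid ** n_zones]


def maqmc3_action_counts(n_zones, n_temp, n_humid):
    """One agent per zone: each picks the temperature and humidity of its own zone."""
    return [n_temp * n_humid] * n_zones


if __name__ == "__main__":
    # Hypothetical discretization: 3 zones, 5 temperature and 4 humidity setpoints.
    zones, temps, humids = 3, 5, 4
    print("single agent:", joint_action_count(zones, temps, humids))
    print("MAQMC2 agents:", maqmc2_action_counts(zones, temps, humids))
    print("MAQMC3 agents:", maqmc3_action_counts(zones, temps, humids))
```

Under these assumed counts, each agent's Q-network in either multi-agent layout has far fewer outputs than the single joint agent, which is consistent with, though not proof of, the convergence behaviour the abstract reports.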
Funding: Supported by the National Key R&D Program of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Nos. 62072324, 61876217, 61876121, 61772357), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), and the Primary Research and Development Plan of Jiangsu Province (No. BE2020026).
Abstract: To optimize the energy consumption of HVAC systems, an energy conservation optimization method based on sensitivity analysis (SA) was proposed, named the sensitivity analysis combination (SAC) method. Based on the SA, the neural network and the related energy-conservation settings of HVAC systems, such as cooling water temperature, chilled water temperature, and supply air temperature, were optimized. Moreover, based on data from an existing HVAC system, various optimal control methods for HVAC systems were tested and evaluated on a simulated HVAC system in TRNSYS. The results show that the proposed SAC method significantly reduces the computational load while maintaining energy performance equivalent to that of traditional optimal control methods.
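The abstract does not spell out how SA is combined with the optimizer. A common way for sensitivity analysis to cut computational load is to rank the setpoints by their influence on energy use and optimize only the most influential ones. The sketch below illustrates that idea under stated assumptions: `energy_kw` is a made-up surrogate model, and the base setpoints, step size, and function names are hypothetical rather than taken from the paper or from TRNSYS.

```python
def energy_kw(setpoints):
    """Hypothetical surrogate energy model (kW); the coefficients are invented."""
    cw = setpoints["cooling_water"]
    chw = setpoints["chilled_water"]
    sa = setpoints["supply_air"]
    return 100.0 + 0.5 * (cw - 30.0) ** 2 + 4.0 * (chw - 7.0) ** 2 + 1.5 * (sa - 14.0) ** 2


def sensitivities(model, base, step=0.5):
    """One-at-a-time central differences: |d energy / d setpoint| at the base point."""
    result = {}
    for name in base:
        up = dict(base)
        down = dict(base)
        up[name] += step
        down[name] -= step
        result[name] = abs(model(up) - model(down)) / (2.0 * step)
    return result


def most_sensitive(model, base, keep=2):
    """Names of the `keep` setpoints with the largest sensitivity, largest first."""
    scores = sensitivities(model, base)
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Assumed operating point in degrees Celsius.
BASE = {"cooling_water": 32.0, "chilled_water": 9.0, "supply_air": 16.0}

if __name__ == "__main__":
    print(sensitivities(energy_kw, BASE))
    print("optimize only:", most_sensitive(energy_kw, BASE))
```

With this toy model, a downstream optimizer would search over two setpoints instead of three, which is the kind of reduction in search effort the abstract attributes to the SA-based method.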
Funding: Financially supported by the National Key R&D Program of China (No. 2020YFC2006602), the National Natural Science Foundation of China (Grant Nos. 62072324, 61876217, 61876121, 61772357), the University Natural Science Foundation of Jiangsu Province (No. 21KJA520005), the Primary Research and Development Plan of Jiangsu Province (No. BE2020026), the Natural Science Foundation of Jiangsu Province (No. BK20190942), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX21_3020).
Abstract: Meta-learning has been widely applied to few-shot reinforcement learning problems, where the aim is to obtain an agent that can learn quickly in a new task. However, these algorithms often ignore some isolated tasks in pursuit of average performance, which may result in negative adaptation in those isolated tasks, and they usually need sufficient learning in a stationary task distribution. In this paper, we present a hierarchical framework of double meta-learning that consists of classification, meta-learning, and re-adaptation. First, in the classification process, we use the learned parameters of each task to divide the tasks into several subsets, each regarded as a task category, which separates out the isolated tasks. Second, in the meta-learning process, we learn a category parameter for every subset via meta-learning. Simultaneously, based on the gradient of each category parameter in each subset, we use meta-learning again to learn a new meta-parameter related to the whole task set, which can serve as an initial parameter for a new task. Finally, in the re-adaptation process, we adapt the parameter of the new task in two steps, using the meta-parameter and then the appropriate category parameter. Experimentally, we demonstrate that our algorithm prevents the agent from negative adaptation without losing average performance over the whole task set. Additionally, our algorithm adapts more rapidly during re-adaptation. Moreover, we show that our algorithm performs well with fewer samples when the agent is exposed to an online meta-learning setting.
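The three stages described in the abstract can be loosely sketched on a toy problem. Everything here is an assumption made for illustration: each task is a one-dimensional quadratic loss, the classification step is a simple gap-based grouping, and a Reptile-style averaging update stands in for whatever meta-learner the paper actually uses.

```python
def adapt(theta, target, steps=5, lr=0.3):
    """Inner loop: a few gradient steps on the toy task loss (theta - target)^2."""
    for _ in range(steps):
        theta -= lr * 2.0 * (theta - target)
    return theta


def classify(task_params, gap=1.0):
    """Stage 1: group task indices whose learned parameters lie within `gap` of a neighbour."""
    order = sorted(range(len(task_params)), key=lambda i: task_params[i])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if task_params[cur] - task_params[prev] <= gap:
            groups[-1].append(cur)
        else:
            groups.append([cur])
    return groups


def reptile(theta, targets, epochs=50, meta_lr=0.5):
    """Reptile-style meta-update: move theta toward the mean adapted parameter."""
    for _ in range(epochs):
        deltas = [adapt(theta, t) - theta for t in targets]
        theta += meta_lr * sum(deltas) / len(deltas)
    return theta


def double_meta_learn(targets):
    """Stages 1 and 2: classify tasks, learn category parameters, then a meta-parameter."""
    learned = [adapt(0.0, t) for t in targets]
    groups = classify(learned)
    categories = [reptile(0.0, [targets[i] for i in g]) for g in groups]
    meta = reptile(0.0, categories)  # category parameters treated as pseudo-tasks
    return groups, categories, meta


def readapt(meta, categories, target):
    """Stage 3: probe from the meta-parameter, pick the nearest category, adapt one step."""
    probe = adapt(meta, target)
    category = min(categories, key=lambda c: abs(c - probe))
    return adapt(category, target, steps=1), category


if __name__ == "__main__":
    # The task with target 9.0 is isolated from the other two clusters.
    groups, categories, meta = double_meta_learn([0.0, 0.2, 0.4, 5.0, 5.2, 9.0])
    theta, category = readapt(meta, categories, target=8.8)
    print(groups, categories, meta, category, theta)
```

In this toy setting, a new task near the isolated one starts its final adaptation from that task's own category parameter rather than from the global average, so a single gradient step lands closer to the target than a single step taken from the meta-parameter alone.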