Funding: This work was supported by the National Key R&D Program of China (2018AAA0101400), the National Natural Science Foundation of China (62173251, 61921004, U1713209), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
Abstract: Driven by advances in the smart grid, the active distribution network (ADN) has attracted much attention due to its capability for active management. By making full use of electricity price signals for optimal scheduling, the total cost of the ADN can be reduced. However, the optimal day-ahead scheduling problem is challenging because future electricity prices are unknown. Moreover, in the ADN some schedulable variables are continuous while others are discrete, which increases the difficulty of determining the optimal scheduling scheme. In this paper, the day-ahead scheduling problem of the ADN is formulated as a Markov decision process (MDP) with a continuous-discrete hybrid action space. An algorithm based on multi-agent hybrid reinforcement learning (HRL) is then proposed to obtain the optimal scheduling scheme. The proposed algorithm adopts the centralized-training, decentralized-execution structure, and different methods are applied to determine the selection policies for the continuous and discrete scheduling variables. Simulation results demonstrate the effectiveness of the algorithm.
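To make the hybrid-action MDP formulation concrete, the following is a minimal, purely illustrative sketch of one scheduling step: the action pairs a continuous variable with a discrete one, and the reward is the negative of the interval cost so that maximizing return minimizes total cost. All names and cost terms here (HybridAction, the switching penalty, the storage convention) are assumptions for illustration, not taken from the paper.

```python
# Illustrative sketch only: a toy MDP step with a continuous-discrete
# hybrid action. The cost model is a placeholder, not the paper's model.
from dataclasses import dataclass

@dataclass
class HybridAction:
    power_setpoint: float  # continuous variable, e.g. storage charge power in MW
    tap_position: int      # discrete variable, e.g. a transformer tap index

def step(load_mw: float, price: float, action: HybridAction) -> float:
    """Return the reward (negative cost) for one scheduling interval."""
    # Grid import = load plus storage charging (a negative setpoint discharges).
    grid_import = max(load_mw + action.power_setpoint, 0.0)
    energy_cost = price * grid_import
    # A small per-tap penalty models device wear (assumed functional form).
    switching_cost = 0.1 * abs(action.tap_position)
    return -(energy_cost + switching_cost)

# Toy usage: discharge 0.5 MW against a 2.0 MW load at a price of 50.
reward = step(load_mw=2.0, price=50.0, action=HybridAction(-0.5, 1))
```

A learned policy would output both fields of HybridAction jointly; the point of the sketch is only that a single reward couples the continuous and discrete decisions.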
Funding: Supported in part by the National Science and Technology Major Project (No. 2022ZD0116900), the National Natural Science Foundation of China (No. 52277118), and the Natural Science Foundation of Tianjin (No. 22JCZDJC00660).
Abstract: In volt/var control (VVC) for active distribution networks, it is essential to integrate traditional voltage regulation devices with modern smart photovoltaic inverters to prevent voltage violations. However, model-based multi-device VVC methods rely on accurate system models for decision-making, which can be challenging due to the extensive modeling workload. To tackle the complexities of multi-device cooperation in VVC, this paper proposes a two-timescale VVC method based on reinforcement learning with a hybrid action space, termed the hybrid action representation twin delayed deep deterministic policy gradient (HAR-TD3) method. This method simultaneously manages traditional discrete voltage regulation devices, which operate on a slower timescale, and smart continuous voltage regulation devices, which function on a faster timescale. To enable effective collaboration between the different action spaces of these devices, we propose a variational auto-encoder-based hybrid action reconstruction network. This network captures the interdependencies of hybrid actions by embedding both discrete and continuous actions into the latent representation space and subsequently decoding them for action reconstruction. The proposed method is validated on the IEEE 33-bus, 69-bus, and 123-bus distribution networks. Numerical results indicate that the proposed method successfully coordinates discrete and continuous voltage regulation devices, achieving fewer voltage violations compared with state-of-the-art reinforcement learning methods.
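The two-timescale structure described above can be sketched independently of the learning method: slow discrete devices (e.g. tap changers) act once every few fast steps, while fast continuous devices (e.g. inverter var setpoints) act every step. Everything below is a hand-written placeholder, assuming a ratio of four fast steps per slow step and trivial rule-based policies in place of the trained HAR-TD3 networks.

```python
# Minimal sketch of a two-timescale dispatch loop. The period, both
# "policies", and the clipping bounds are illustrative assumptions only.

SLOW_PERIOD = 4  # fast steps per slow-timescale decision (assumed ratio)

def slow_policy(voltage_pu: float) -> int:
    """Discrete tap increment: raise tap when voltage sags (placeholder rule)."""
    return 1 if voltage_pu < 0.95 else 0

def fast_policy(voltage_pu: float) -> float:
    """Continuous var setpoint in p.u., nudging voltage toward 1.0 (placeholder)."""
    return max(min(1.0 - voltage_pu, 0.2), -0.2)

def rollout(voltages):
    """Return a list of (tap_position, var_setpoint) pairs, one per fast step."""
    schedule, tap = [], 0
    for t, v in enumerate(voltages):
        if t % SLOW_PERIOD == 0:      # slow timescale: discrete device acts
            tap += slow_policy(v)
        q = fast_policy(v)            # fast timescale: continuous device acts
        schedule.append((tap, q))
    return schedule

plan = rollout([0.93, 0.96, 0.97, 0.99, 1.02])
```

In the paper's method both policies would instead be produced by decoding a shared latent action from the VAE-based reconstruction network, so that the discrete and continuous commands stay mutually consistent.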
Funding: Supported by the National Natural Science Foundation of China (No. 62002016), the Science and Technology Development Fund, Macao S.A.R. (No. 0137/2019/A3), the Beijing Natural Science Foundation (No. 9204028), and the Guangdong Basic and Applied Basic Research Foundation (No. 2019A1515111165).
Abstract: This paper develops deep reinforcement learning (DRL) algorithms for optimizing the operation of a home energy system consisting of photovoltaic (PV) panels, a battery energy storage system, and household appliances. Model-free DRL algorithms can efficiently handle the difficulty of energy system modeling and the uncertainty of PV generation. However, the discrete-continuous hybrid action space of the considered home energy system challenges existing DRL algorithms designed for either discrete actions or continuous actions alone. Thus, a mixed deep reinforcement learning (MDRL) algorithm is proposed, which integrates the deep Q-learning (DQL) algorithm and the deep deterministic policy gradient (DDPG) algorithm. The DQL algorithm deals with discrete actions, while the DDPG algorithm handles continuous actions. The MDRL algorithm learns the optimal strategy through trial-and-error interactions with the environment. However, unsafe actions, which violate system constraints, can incur high costs. To handle this problem, a safe-MDRL algorithm is further proposed. Simulation studies demonstrate that the proposed MDRL algorithm can efficiently handle the challenge posed by the discrete-continuous hybrid action space in home energy management. Compared with benchmark algorithms on the test dataset, the proposed MDRL algorithm reduces the operation cost while maintaining human thermal comfort. Moreover, the safe-MDRL algorithm greatly reduces the loss of thermal comfort incurred by the MDRL algorithm during the learning stage.
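The mixed action-selection idea, one branch for each action type, can be sketched as follows: a DQL-style epsilon-greedy argmax over Q-values picks the discrete action, and a DDPG-style deterministic policy (here a stand-in lambda, not a trained network) produces the continuous action. The Q-values, the actor, and the clipping range below are hypothetical placeholders, not the paper's trained models.

```python
# Sketch of hybrid action selection combining a DQL-style discrete branch
# with a DDPG-style continuous branch. All numeric inputs are toy values.
import random

def select_hybrid_action(state, q_values, actor, epsilon=0.1,
                         rng=random.Random(0)):
    """Return (discrete_action, continuous_action) for the given state."""
    # Discrete branch: epsilon-greedy over the Q-value estimates (DQL-style).
    if rng.random() < epsilon:
        discrete = rng.randrange(len(q_values))
    else:
        discrete = max(range(len(q_values)), key=lambda a: q_values[a])
    # Continuous branch: deterministic actor output, clipped to [-1, 1]
    # (DDPG-style; exploration noise omitted for clarity).
    continuous = max(min(actor(state), 1.0), -1.0)
    return discrete, continuous

# Toy usage: Q-values over 3 appliance modes; the "actor" sets battery power.
q = [0.2, 1.5, -0.3]
action = select_hybrid_action(state=0.4, q_values=q,
                              actor=lambda s: 2.0 * s - 0.5, epsilon=0.0)
```

A safe-MDRL variant would additionally project or mask these actions against system constraints before execution, which is the paper's mechanism for avoiding costly constraint violations during learning.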