Driven by the improvement of the smart grid, the active distribution network (ADN) has attracted much attention due to its capability for active management. By making full use of electricity price signals for optimal scheduling, the total cost of the ADN can be reduced. However, the optimal day-ahead scheduling problem is challenging since the future electricity price is unknown. Moreover, in the ADN, some schedulable variables are continuous while others are discrete, which increases the difficulty of determining the optimal scheduling scheme. In this paper, the day-ahead scheduling problem of the ADN is formulated as a Markov decision process (MDP) with a continuous-discrete hybrid action space. Then, an algorithm based on multi-agent hybrid reinforcement learning (HRL) is proposed to obtain the optimal scheduling scheme. The proposed algorithm adopts the structure of centralized training and decentralized execution, and different methods are applied to determine the selection policies for the continuous and discrete scheduling variables. Simulation results demonstrate the effectiveness of the algorithm.
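To make the hybrid action space concrete, here is a minimal sketch of a day-ahead scheduling environment in which each action pairs a continuous variable with a discrete one. The environment, its price profile, and the cost model are all hypothetical simplifications invented for illustration; the abstract does not specify the paper's actual model, and real prices are treated there as unknown in advance.

```python
import numpy as np

class HybridActionADNEnv:
    """Toy day-ahead scheduling environment with a hybrid action space.

    Hypothetical simplification: one continuous variable (battery power,
    normalized to [-1, 1]; positive = charging) and one discrete variable
    (capacitor-bank tap in {0, 1, 2}). Prices here are a fixed 24-hour
    profile purely for illustration.
    """

    PRICES = np.concatenate([np.full(8, 0.3), np.full(8, 0.8), np.full(8, 0.5)])

    def __init__(self):
        self.hour = 0
        self.soc = 0.5  # battery state of charge, fraction of capacity

    def reset(self):
        self.hour, self.soc = 0, 0.5
        return np.array([self.hour / 24.0, self.soc])

    def step(self, action):
        power, tap = action  # (continuous charge power, discrete tap index)
        power = float(np.clip(power, -1.0, 1.0))
        self.soc = float(np.clip(self.soc + 0.1 * power, 0.0, 1.0))
        # Cost: pay the current price for energy bought, minus a small
        # illustrative loss-reduction credit that grows with the tap setting.
        cost = self.PRICES[self.hour] * max(power, 0.0) - 0.05 * tap
        self.hour += 1
        done = self.hour >= 24
        return np.array([self.hour / 24.0, self.soc]), -cost, done
```

A hand-crafted policy that charges off-peak and discharges on-peak already lowers cost in this toy model; the paper's multi-agent HRL algorithm would instead learn separate selection policies for the continuous and discrete components.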
Unmanned Aerial Vehicles (UAVs) play a prominent role in various fields, and autonomous navigation is a crucial component of UAV intelligence. Deep Reinforcement Learning (DRL) has expanded the research avenues for addressing challenges in autonomous navigation. Nonetheless, challenges persist, including getting stuck in local optima, consuming excessive computation during action-space exploration, and neglecting deterministic experience. This paper proposes a noise-driven enhancement strategy. In accordance with the overall learning phases, a global noise control method is designed, while a differentiated local noise control method is developed by analyzing the exploration demands of four typical situations encountered by a UAV during navigation. Both methods are integrated into a dual model for noise control to regulate action-space exploration. Furthermore, dual noise experience replay buffers are designed to make rational use of both deterministic and noisy experience. For uncertain environments, a Noise-Driven Enhancement Priority Memory TD3 (NDE-PMTD3) algorithm is developed, building on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm with a Long Short-Term Memory (LSTM) network and Prioritized Experience Replay (PER). We established a simulation environment to compare different algorithms, and their performance is analyzed in various scenarios. The training results indicate that the proposed algorithm accelerates convergence and enhances convergence stability. In test experiments, the proposed algorithm efficiently performs autonomous navigation tasks in diverse environments, demonstrating superior generalization.
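The global noise control idea above can be sketched as a schedule that shrinks the exploration noise added to the deterministic policy output as learning progresses. The linear schedule and its parameters below are assumptions for illustration; the abstract does not give NDE-PMTD3's actual schedule or its local (situation-dependent) control rules.

```python
import numpy as np

def global_noise_sigma(step, total_steps, sigma_max=0.4, sigma_min=0.05):
    """Hypothetical global noise schedule: the standard deviation of
    exploration noise decays linearly over the learning phases."""
    frac = min(step / total_steps, 1.0)
    return sigma_max + (sigma_min - sigma_max) * frac

def noisy_action(deterministic_action, sigma, low=-1.0, high=1.0, rng=None):
    """Add clipped Gaussian noise to a deterministic policy output,
    as TD3-style algorithms do during action-space exploration."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.normal(0.0, sigma, size=np.shape(deterministic_action))
    return np.clip(np.asarray(deterministic_action) + noise, low, high)
```

In the paper's design, transitions generated with and without noise would then be routed to the two separate replay buffers so that deterministic and noisy experience can be reused at different rates.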
The proliferation of distributed energy resources and time-varying network topologies in active distribution networks presents unprecedented challenges for network operators. While reinforcement learning (RL) has shown promise in addressing network-constrained energy scheduling, it faces difficulties in managing the complexities of dynamic topologies and discrete-continuous hybrid action spaces. To address these challenges, a graph-based safe RL approach is proposed to learn dynamic optimal power flow under time-varying network topologies. The approach leverages graph convolution operators to handle network topology changes, while safe RL with parameterized actions ensures that the learned policy respects operational constraints. Specifically, the graph convolution operator abstracts key characteristics of the network topology, enabling effective power flow management in non-stationary environments. In addition, a parameterized-action constrained Markov decision process is employed to handle the hybrid action space and ensure compliance with physical network constraints, thereby accelerating the deployment of safe policies for hybrid action spaces. Numerical results demonstrate that the proposed approach efficiently navigates the discrete-continuous decision space while accounting for the constraints imposed by the dynamic nature of power flow in time-varying network topologies.
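The role of the graph convolution operator can be illustrated with a minimal NumPy sketch: node features are propagated through a normalized adjacency matrix, so when the network topology changes only the adjacency matrix changes and the learned weights are reused. This is the standard GCN-style propagation rule, not necessarily the exact operator used in the paper.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops:
    D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A, X, W):
    """One graph-convolution layer: relu(norm(A) @ X @ W).
    A: (n, n) adjacency for the current topology; X: (n, f) node
    features (e.g., bus injections); W: (f, h) learned weights."""
    return np.maximum(normalized_adjacency(A) @ X @ W, 0.0)
```

Because `W` is independent of `A`, the same layer can embed the grid state under any of the time-varying topologies before the safe RL policy selects its discrete and continuous action components.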
In physical information theory, elementary objects are represented as correlation structures with oscillator properties and characterized by action. The procedure makes it possible to describe the photons of positive and negative charges by positive and negative real action; gravitons are represented in equal amounts by positive and negative real, i.e., virtual action, and the components of the vacuum are characterized by deactivated virtual action. An analysis of the currents in the correlation structures of photons of static Maxwell fields with wave and particle properties, of the Maxwell vacuum, and of the gravitons leads to a uniform three-dimensional representation of the structure of the action. Based on these results, a basic structure consisting of a system of oscillators is proposed, which describes the properties of charges and masses and interacts with the photons of static Maxwell fields and with gravitons. All properties of the elementary components of nature can thus be traced back to a basic structure of action. It follows that nature can be derived from a uniform structure, and this structure of action must therefore also be the basis of the origin of the cosmos.
Segal’s chronometric theory is based on a space-time D, which may be viewed as a Lie group with a causal structure defined by an invariant Lorentzian form on the Lie algebra u(2). Similarly, the space-time F is realized as a Lie group with a causal structure defined by an invariant Lorentzian form on u(1,1). Two Lie groups G, GF are introduced as representations of SU(2,2); they are related via conjugation by a certain matrix W in GL(4). The linear-fractional action of G on D is well known to be global and conformal, and it plays a crucial role in the analysis of space-time bundles carried out by Paneitz and Segal in the 1980s. That analysis was based on the parallelizing group U(2). In this paper, a general (“geometric”) description of the singularities of the linear-fractional conformal GF-action on F is given, and specific examples are presented. The results call for an analysis of space-time bundles based on U(1,1) as the parallelizing group. Certain key stages of such an analysis are suggested.
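For readers unfamiliar with the linear-fractional action mentioned above, the standard formula can be sketched as follows; the specific conventions in the paper (and its matrix W) may differ. Writing an element of SU(2,2) in 2×2 blocks, the action on a 2×2 matrix variable Z is

```latex
g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SU(2,2),
\qquad
g \cdot Z = (AZ + B)(CZ + D)^{-1},
```

and the singularities discussed in the paper arise precisely where $\det(CZ + D) = 0$, so that the fractional expression fails to be defined.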
Funding (ADN day-ahead scheduling paper): This work was supported by the National Key R&D Program of China (2018AAA0101400), the National Natural Science Foundation of China (62173251, 61921004, U1713209), and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
Funding (UAV autonomous navigation paper): The authors thank the Collaborative Innovation Project of Shanghai, China, for financial support.
Funding (graph-based safe RL paper): Supported by the Tianjin Science and Technology Program (No. 22JCZDJC00820).