Journal Articles
5 articles found
1. A Low-Collision and Efficient Grasping Method for Manipulator Based on Safe Reinforcement Learning
Authors: Qinglei Zhang, Bai Hu, Jiyun Qin, Jianguo Duan, Ying Zhou. Computers, Materials & Continua, 2025, Issue 4, pp. 1257-1273 (17 pages).
Grasping is one of the most fundamental operations in modern robotics applications. While deep reinforcement learning (DRL) has demonstrated strong potential in robotics, existing methods place too much emphasis on maximizing the cumulative reward of the task, and potential safety risks are often ignored. In this paper, an optimization method based on safe reinforcement learning (Safe RL) is proposed to address the robotic grasping problem under safety constraints. Specifically, considering the obstacle avoidance constraints of the system, the grasping problem of the manipulator is modeled as a Constrained Markov Decision Process (CMDP). A Lagrange multiplier and a dynamic weighting mechanism are introduced into the Proximal Policy Optimization (PPO) framework, leading to the dynamic weighted Lagrange PPO (DWL-PPO) algorithm, which penalizes violations of safety constraints while optimizing the policy. In addition, orientation control of the end-effector is included in the reward function, and a compound reward function adapted to changes in pose is designed. The efficacy and advantages of the proposed method are demonstrated by extensive training and testing in the PyBullet simulator. The grasping experiments show that the approach provides superior safety and efficiency compared with other advanced RL methods and achieves a good trade-off between model learning and risk aversion.
Keywords: safe reinforcement learning (Safe RL); manipulator grasping; obstacle avoidance constraints; Lagrange multiplier; dynamic weighting
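The constrained-policy-optimization idea summarized in entry 1 can be illustrated with a short sketch: a PPO-style clipped surrogate for the task reward is combined with a multiplier-scaled surrogate for the collision cost, and the multiplier is updated by dual ascent. This is a minimal illustration on assumed toy data; the dynamic weight `w`, the cost limit, and all numbers are hypothetical placeholders rather than the DWL-PPO specifics.

```python
# Minimal sketch of a Lagrangian-PPO objective for a constrained MDP (CMDP),
# in the spirit of the DWL-PPO idea described in the abstract. The dynamic
# weighting rule below is a simple hypothetical stand-in, not the paper's mechanism.
import numpy as np

def clipped_surrogate(ratio, adv, eps=0.2):
    """Standard PPO clipped surrogate (to be maximized)."""
    return np.minimum(ratio * adv, np.clip(ratio, 1 - eps, 1 + eps) * adv).mean()

def lagrangian_ppo_loss(ratio, reward_adv, cost_adv, lam, w):
    """Reward surrogate minus a weighted, multiplier-scaled cost surrogate."""
    reward_term = clipped_surrogate(ratio, reward_adv)
    cost_term = clipped_surrogate(ratio, cost_adv)
    return -(reward_term - w * lam * cost_term)   # negated for gradient descent

def dual_update(lam, avg_episode_cost, cost_limit, lr=0.05):
    """Dual ascent on the multiplier: it grows while the constraint is violated."""
    return max(0.0, lam + lr * (avg_episode_cost - cost_limit))

# toy batch (placeholders for rollout data)
rng = np.random.default_rng(0)
ratio = np.exp(rng.normal(0, 0.1, 256))          # new/old policy probability ratios
reward_adv = rng.normal(0, 1, 256)               # advantages of the task reward
cost_adv = rng.normal(0.1, 1, 256)               # advantages of the collision cost
lam, cost_limit = 0.5, 0.2
w = 1.0 + max(0.0, 0.3 - cost_limit)             # hypothetical dynamic weight
print(lagrangian_ppo_loss(ratio, reward_adv, cost_adv, lam, w))
print(dual_update(lam, avg_episode_cost=0.3, cost_limit=cost_limit))
```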
2. Safe Reinforcement Learning for Grid-forming Inverter Based Frequency Regulation with Stability Guarantee
Authors: Hang Shuai, Buxin She, Jinning Wang, Fangxing Li. Journal of Modern Power Systems and Clean Energy, 2025, Issue 1, pp. 79-86 (8 pages).
This study investigates a safe reinforcement learning algorithm for grid-forming (GFM) inverter based frequency regulation. To guarantee the stability of the inverter-based resource (IBR) system under the learned control policy, a model-based reinforcement learning (MBRL) algorithm is combined with a Lyapunov approach, which determines the safe region of states and actions. To obtain a near-optimal control policy, the control performance is safely improved by approximate dynamic programming (ADP) using data sampled from the region of attraction (ROA). Moreover, to enhance control robustness against parameter uncertainty in the inverter, a Gaussian process (GP) model is adopted to learn the system dynamics from measurements. Numerical simulations validate the effectiveness of the proposed algorithm.
Keywords: inverter-based resource (IBR); virtual synchronous generator (VSG); safe reinforcement learning; Lyapunov function; frequency regulation; grid-forming inverter
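A minimal sketch of the Lyapunov-based safety idea in entry 2: a candidate control action is accepted only if the predicted next state keeps a quadratic Lyapunov function decreasing and inside an estimated region-of-attraction level set. The toy linear dynamics `A`, `B`, the matrix `P`, and the thresholds are assumptions for illustration; the paper learns the dynamics with a Gaussian process and derives the safe region within its MBRL formulation.

```python
# Minimal sketch of a Lyapunov-based safety filter: accept an action only if the
# predicted next state stays inside a region-of-attraction level set and the
# Lyapunov function decreases. All models and numbers are illustrative placeholders.
import numpy as np

P = np.diag([2.0, 1.0])            # Lyapunov matrix, V(x) = x^T P x (assumed given)
A = np.array([[0.95, 0.10],        # toy discrete-time frequency/power dynamics
              [0.00, 0.90]])
B = np.array([[0.0], [0.05]])

def V(x):
    return float(x @ P @ x)

def predicted_next_state(x, u):
    """Placeholder one-step model; the paper learns this with a Gaussian process."""
    return A @ x + (B @ np.atleast_1d(u)).ravel()

def is_safe(x, u, decrease_rate=0.05, roa_level=1.0):
    """Accept u only if V decreases and the next state stays inside the ROA level set."""
    x_next = predicted_next_state(x, u)
    return V(x_next) <= (1 - decrease_rate) * V(x) and V(x_next) <= roa_level

x = np.array([0.3, -0.2])          # e.g., frequency and power deviations
for u in (0.5, -0.5, 0.0):
    print(u, is_safe(x, u))
```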
3. DistFlow Safe Reinforcement Learning Algorithm for Voltage Magnitude Regulation in Distribution Networks
Authors: Shengren Hou, Aihui Fu, Edgar Mauricio Salazar Duque, Peter Palensky, Qixin Chen, Pedro P. Vergara. Journal of Modern Power Systems and Clean Energy, 2025, Issue 1, pp. 300-311 (12 pages).
The integration of distributed energy resources (DERs) has escalated the challenge of voltage magnitude regulation in distribution networks. Model-based approaches, which rely on complex sequential mathematical formulations, cannot meet real-time demands. Deep reinforcement learning (DRL) offers an alternative by training offline with distribution network simulators and then executing online without further computation. However, DRL algorithms fail to enforce voltage magnitude constraints during training and testing, potentially leading to serious operational violations. To tackle these challenges, we introduce a novel safety-guaranteed reinforcement learning algorithm, the DistFlow safe reinforcement learning (DF-SRL) algorithm, designed specifically for real-time voltage magnitude regulation in distribution networks. The DF-SRL algorithm incorporates a DistFlow linearization to construct an expert-knowledge-based safety layer, which is overlaid on top of the agent policy and recalibrates unsafe actions to the safe domain through a quadratic programming formulation. Simulation results show that the DF-SRL algorithm consistently satisfies voltage magnitude constraints during both training and real-time operation (testing) and achieves faster convergence and higher performance, which differentiates it from (safe) DRL benchmark algorithms.
Keywords: voltage regulation; distribution network; safe reinforcement learning; energy management
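Entry 3's safety layer can be pictured as a projection step: the agent's raw action is mapped to the nearest action whose linearized voltage prediction stays within limits, via a small quadratic program. The sensitivity matrix `S`, base voltages, and bounds below are illustrative assumptions, and a generic SLSQP solve stands in for the paper's DistFlow-based QP formulation.

```python
# Minimal sketch of a safety-layer projection: find the action closest to the
# agent's raw action such that the linearized voltage prediction respects limits.
# The network model here is an illustrative placeholder, not the paper's DistFlow data.
import numpy as np
from scipy.optimize import minimize

S = np.array([[0.04, 0.01],        # assumed voltage-magnitude sensitivities dV/da
              [0.01, 0.05]])
v_base = np.array([0.93, 1.06])    # predicted voltages with zero control (p.u.)
v_min, v_max = 0.95, 1.05

def project_action(a_raw):
    """Solve: min ||a - a_raw||^2  s.t.  v_min <= v_base + S @ a <= v_max."""
    cons = [
        {"type": "ineq", "fun": lambda a: v_base + S @ a - v_min},    # lower bounds
        {"type": "ineq", "fun": lambda a: v_max - (v_base + S @ a)},  # upper bounds
    ]
    res = minimize(lambda a: np.sum((a - a_raw) ** 2), x0=a_raw,
                   constraints=cons, method="SLSQP")
    return res.x

a_raw = np.array([0.0, 0.0])       # unsafe: leaves both buses outside the band
a_safe = project_action(a_raw)
print(a_safe, v_base + S @ a_safe)
```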
4. Mixed Deep Reinforcement Learning Considering Discrete-continuous Hybrid Action Space for Smart Home Energy Management (Cited by 5)
Authors: Chao Huang, Hongcai Zhang, Long Wang, Xiong Luo, Yonghua Song. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2022, Issue 3, pp. 743-754 (12 pages).
This paper develops deep reinforcement learning (DRL) algorithms for optimizing the operation of a home energy system consisting of photovoltaic (PV) panels, a battery energy storage system, and household appliances. Model-free DRL algorithms can efficiently handle the difficulty of energy system modeling and the uncertainty of PV generation. However, the discrete-continuous hybrid action space of the considered home energy system challenges existing DRL algorithms designed for either discrete or continuous actions. Thus, a mixed deep reinforcement learning (MDRL) algorithm is proposed, which integrates the deep Q-learning (DQL) algorithm and the deep deterministic policy gradient (DDPG) algorithm: the DQL algorithm deals with discrete actions, while the DDPG algorithm handles continuous actions. The MDRL algorithm learns the optimal strategy through trial-and-error interactions with the environment. However, unsafe actions that violate system constraints can incur great cost. To handle this problem, a safe-MDRL algorithm is further proposed. Simulation studies demonstrate that the proposed MDRL algorithm can efficiently handle the challenge of the discrete-continuous hybrid action space for home energy management. Compared with benchmark algorithms on the test dataset, the MDRL algorithm reduces operation cost while maintaining human thermal comfort. Moreover, the safe-MDRL algorithm greatly reduces the loss of thermal comfort in the learning stage compared with the proposed MDRL algorithm.
Keywords: demand response; deep reinforcement learning; discrete-continuous action space; home energy management; safe reinforcement learning
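A minimal sketch of the hybrid action selection described in entry 4: a discrete Q-head (DQL-style) chooses an appliance mode while a continuous policy head (DDPG-style) outputs a bounded battery set-point. The random weight matrices and the example state vector are hypothetical placeholders for trained networks, not the paper's MDRL architecture.

```python
# Minimal sketch of combining a discrete Q-head with a continuous policy head for
# a discrete-continuous hybrid action space. The "networks" below are random
# placeholders standing in for trained DQL and DDPG components.
import numpy as np

rng = np.random.default_rng(1)
W_q = rng.normal(size=(3, 4))      # maps state -> Q-values over 3 discrete appliance modes
W_mu = rng.normal(size=(1, 4))     # maps state -> continuous battery power in [-1, 1]

def select_action(state, eps=0.1):
    """Epsilon-greedy discrete choice (DQL-style) plus deterministic continuous action (DDPG-style)."""
    q_values = W_q @ state
    if rng.random() < eps:
        discrete = int(rng.integers(len(q_values)))   # exploration over appliance modes
    else:
        discrete = int(np.argmax(q_values))
    continuous = float(np.tanh(W_mu @ state)[0])      # bounded charge/discharge set-point
    return discrete, continuous

state = np.array([0.6, 0.2, 22.5, 0.8])               # e.g., PV, price, temperature, SoC
print(select_action(state))
```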
5. Improved Proximal Policy Optimization Algorithm for Sequential Security-constrained Optimal Power Flow Based on Expert Knowledge and Safety Layer (Cited by 3)
Authors: Yanbo Chen, Qintao Du, Honghai Liu, Liangcheng Cheng, Muhammad Shahzad Younis. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2024, Issue 3, pp. 742-753 (12 pages).
In recent years, reinforcement learning (RL) has emerged as a solution for model-free dynamic programming problems that cannot be effectively solved by traditional optimization methods. It has gradually been applied in fields such as economic dispatch of power systems due to its strong self-learning and self-optimizing capabilities. However, existing RL-based economic scheduling methods ignore the security risks that the agent may introduce during exploration, which poses a risk of issuing instructions that threaten the safe operation of the power system. Therefore, we propose an improved proximal policy optimization algorithm for sequential security-constrained optimal power flow (SCOPF), based on expert knowledge and a safety layer, to determine the active power dispatch strategy, the voltage optimization scheme of the units, and the charging/discharging dispatch of energy storage systems. Expert experience is introduced to improve the ability to enforce constraints such as power balance during training while guiding the agent to improve the utilization rate of renewable energy. Additionally, to avoid line overload, we add a safety layer at the end of the policy network that introduces transmission constraints to avoid dangerous actions and tackle the sequential SCOPF problem. Simulation results on a modified IEEE 118-bus system verify the effectiveness of the proposed algorithm.
Keywords: sequential security-constrained optimal power flow (SCOPF); expert experience; safety layer; renewable energy; safe reinforcement learning
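The expert-knowledge constraint handling in entry 5 can be illustrated by a generic dispatch-repair heuristic: the agent's raw generator set-points are clipped to unit limits and the remaining power imbalance is spread over units with headroom. The limits, demand, and repair rule are assumptions for illustration only; the paper's expert rules and its line-overload safety layer are not reproduced here.

```python
# Minimal sketch of an expert-knowledge repair step: rescale generator set-points
# so total generation matches demand while respecting unit limits. This is a
# generic power-balance heuristic, not the paper's exact rule.
import numpy as np

p_min = np.array([10.0, 20.0, 0.0])     # assumed unit limits (MW)
p_max = np.array([100.0, 150.0, 80.0])

def repair_dispatch(p_raw, demand, iters=20):
    """Clip to unit limits, then spread the remaining imbalance over units with headroom."""
    p = np.clip(p_raw, p_min, p_max)
    for _ in range(iters):
        imbalance = demand - p.sum()
        if abs(imbalance) < 1e-6:
            break
        headroom = (p_max - p) if imbalance > 0 else (p - p_min)
        if headroom.sum() <= 0:
            break                        # infeasible with these limits
        p += imbalance * headroom / headroom.sum()
        p = np.clip(p, p_min, p_max)
    return p

p_raw = np.array([120.0, 30.0, 5.0])     # agent output violating limits and balance
print(repair_dispatch(p_raw, demand=200.0))
```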