Abstract
Distributed power-hydrogen coupled systems (DPHCS) can significantly enhance the renewable energy hosting capacity and flexible regulation capability of distribution networks through cross-period energy conversion and dynamic power regulation. However, few studies have examined in depth the medium- and long-term operational optimization of distribution networks coordinated with DPHCS. To address this gap, a medium- and long-term operational optimization method for DPHCS-coordinated distribution networks based on safe reinforcement learning is proposed. First, a medium- and long-term optimization model of the distribution network that accounts for the flexible regulation characteristics of DPHCS is established, and the optimization task is formulated as a Markov decision process, enabling efficient cross-period energy transfer while optimizing the medium- and long-term operating cost of the system. Next, because the randomness of distributed resources makes distribution network operating constraints prone to violation, a DistFlow-based proximal policy optimization (DFPPO) algorithm is proposed, in which a domain-knowledge safety layer maps infeasible actions into the security region to safeguard the operational security of the distribution network. Furthermore, to address the long computation time and low efficiency of power flow calculation during long-horizon agent training, a linearized power flow method based on Laurent series expansion is proposed, which significantly improves the computational efficiency of the reinforcement learning algorithm under long-period, high-frequency interactive training. Finally, simulations on the IEEE 33-bus and IEEE 118-bus systems show that the proposed method effectively enables cross-seasonal energy transfer and reduces the average operating cost by approximately 27% while ensuring operational security over medium- and long-term scales, offering clear advantages in both security and economy.
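The idea of a safety layer that maps an infeasible action into a security region can be illustrated with a minimal sketch. This is not the DFPPO safety layer or the Laurent series-based power flow from the article; it assumes a single chain feeder, the lossless LinDistFlow approximation, and one scalar DPHCS injection per step. All function names, parameters, and numbers below are hypothetical.

```python
import numpy as np

def lindistflow_v2(p_load, q_load, r, x, v0_sq=1.0):
    """Squared voltages (p.u.) at buses 1..n of a chain feeder (LinDistFlow).

    Line i connects bus i-1 to bus i; bus 0 is the substation.
    Losses are neglected, which is the LinDistFlow assumption.
    """
    p_load, q_load = np.asarray(p_load, float), np.asarray(q_load, float)
    r, x = np.asarray(r, float), np.asarray(x, float)
    # Flow on line i is the sum of net loads at bus i and all buses beyond it.
    P = np.cumsum(p_load[::-1])[::-1]
    Q = np.cumsum(q_load[::-1])[::-1]
    # Voltage drops accumulate along the path from the substation.
    return v0_sq - 2.0 * np.cumsum(r * P + x * Q)

def safe_action(a, k, p_load, q_load, r, x, a_min, a_max,
                v_min=0.95, v_max=1.05):
    """Clip a scalar injection a at bus index k (a > 0 means generation,
    a < 0 means consumption) to the interval that keeps all voltages in bounds."""
    p_load = np.asarray(p_load, float)
    base = lindistflow_v2(p_load, q_load, r, x)
    # Voltages are affine in a: v(a) = base + a * s, with s the sensitivity.
    unit = np.zeros_like(p_load)
    unit[k] = -1.0  # one unit of injection equals one unit less load
    s = lindistflow_v2(p_load + unit, q_load, r, x) - base
    lo, hi = a_min, a_max
    for b, si in zip(base, s):
        if si == 0.0:
            continue  # this bus voltage does not depend on the action
        bounds = sorted(((v_min**2 - b) / si, (v_max**2 - b) / si))
        lo, hi = max(lo, bounds[0]), min(hi, bounds[1])
    if lo > hi:
        raise ValueError("no action satisfies the voltage limits")
    return float(np.clip(a, lo, hi))

# Hypothetical 3-bus feeder; the device sits at the last bus (index 2).
r = x = [0.01, 0.01, 0.01]
p, q = [0.5, 0.5, 0.5], [0.1, 0.1, 0.1]
print(safe_action(-1.0, 2, p, q, r, x, a_min=-2.0, a_max=2.0))
```

Because the action is a scalar and the voltages are affine in it under this approximation, the safe set is an interval and the projection reduces to clipping. A multi-dimensional action would need a projection onto a polytope instead.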
Authors
吴明贺
洪芦诚
王梓萩
袁晓东
朱进
侯胜任
WU Minghe; HONG Lucheng; WANG Ziqiu; YUAN Xiaodong; ZHU Jin; HOU Shengren (School of Electrical Engineering, Southeast University, Nanjing 210096, China; Jiangsu Provincial Key Laboratory of Smart Grid Technology & Equipment (Southeast University), Nanjing 210096, China; Electric Power Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 211103, China; Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft 2628SK, Netherlands)
Source
Automation of Electric Power Systems (《电力系统自动化》)
Peking University Core Journal (北大核心)
2026, No. 7, pp. 129-141 (13 pages)
Funding
Supported by the National Key R&D Program of China, "Hydrogen Energy Technology" Key Special Project (No. 2024YFB4007400).
Keywords
power-hydrogen coupled system
distribution network
operational optimization
safe reinforcement learning
domain-knowledge safety layer
security region
power flow