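The low level of automation and intelligence in coal mine roadway support equipment limits roadway forming efficiency and is a key cause of the imbalance between mining and excavation. To address the low automation and poor support efficiency of this equipment, a path planning method based on deep reinforcement learning is proposed for the manipulator of a drilling and anchoring robot that integrates a cantilever roadheader with a multi-degree-of-freedom manipulator. A coal mine roadway is constructed in a virtual environment, and collision detection models are built between the manipulator and the robot body, the coal wall, and the supporting steel belt. Collision detection in the virtual environment uses a hierarchical bounding box method, which yields an obstacle avoidance strategy for roadways with restricted boundaries. Several improvements are made to the Proximal Policy Optimization (PPO) algorithm. Because the state-space input of a multi-degree-of-freedom manipulator does not have a fixed length, a Long Short-Term Memory (LSTM) network is introduced to process the environment state input, which improves the algorithm's adaptability to the environment. Because rewards and penalties are sparse, an Intrinsic Curiosity Module (ICM) is also introduced, which gives intrinsic rewards that encourage the agent to explore the environment more extensively. The agent is built on a reward and penalty mechanism, and its state space and action space are defined according to the motion characteristics of the drilling and anchoring robot. The agent is trained in the same scene with each of two algorithms, and the results are compared in terms of reward value, steps per episode, Actor network loss, and Critic network loss, followed by simulated ablation experiments. The experimental results show that, in a setting where the original PPO algorithm cannot complete the task, the improved algorithm shortens the path length by 3.98% and the time taken by 25.6% compared with the PPO-ICM algorithm, which can also complete the task. To further verify the robustness of the improved algorithm, several groups of experiments were designed. The improved PPO algorithm completed the path planning task in all of them, with the distance error between the path end point and the target position within 3.88 cm and the error in the angle between the anchor bolt and the vertical within 3°, which raises the automation level of the coal mine roadway support system. The results verify the feasibility and effectiveness of the proposed method for path planning of the multi-degree-of-freedom manipulator of a drilling and anchoring robot when anchor hole positions vary during underground roadway support.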
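The hierarchical bounding box collision check named in the abstract above can be illustrated with a minimal sketch. It assumes axis-aligned boxes and a hand-built two-level hierarchy; the `Node` class, the helper functions, and all dimensions are hypothetical and are not taken from the paper, which does not state its box type or tree construction.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """One node of a bounding-box hierarchy: an axis-aligned box plus child nodes."""
    lo: Vec3
    hi: Vec3
    children: List["Node"] = field(default_factory=list)


def boxes_overlap(a: Node, b: Node) -> bool:
    # Two axis-aligned boxes overlap only if their intervals overlap on every axis.
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))


def collides(a: Node, b: Node) -> bool:
    # Prune: if the enclosing boxes are disjoint, nothing inside them can touch.
    if not boxes_overlap(a, b):
        return False
    # Both are leaves: overlapping leaf boxes are reported as a collision.
    if not a.children and not b.children:
        return True
    # Otherwise descend into whichever node still has children.
    if a.children:
        return any(collides(c, b) for c in a.children)
    return any(collides(a, c) for c in b.children)


# Hypothetical geometry in metres: a two-link arm, a steel belt, and a low box.
link1 = Node((0.0, 0.0, 0.0), (1.0, 0.2, 0.2))
link2 = Node((1.0, 0.0, 0.0), (1.2, 0.2, 1.5))
arm = Node((0.0, 0.0, 0.0), (1.2, 0.2, 1.5), [link1, link2])
belt = Node((0.0, 0.0, 1.4), (3.0, 0.3, 1.6))
low_box = Node((0.0, 0.0, -1.0), (3.0, 0.3, -0.5))

arm_hits_belt = collides(arm, belt)
arm_hits_low_box = collides(arm, low_box)
```

The root-level test rejects most pairs cheaply, and only pairs whose enclosing boxes overlap are refined link by link, which is the usual motivation for a hierarchy over testing every link against every obstacle.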
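To smooth fluctuations in micro-source output in the grid-connected system of a half-bridge converter series Y-connection micro-grid (HCSY-MG), and to keep the sums of the DC-side voltages of the phases equal and the three-phase grid-connected currents balanced, an optimal charge and discharge control strategy for a distributed hybrid energy storage system (HESS) based on improved proximal policy optimization (PPO) is proposed. Taking into account the grid-connected current of the HCSY-MG system and the characteristics of the distributed HESS, the main system variables that affect the grid-connected current and the best topology for connecting the HESS to the system are determined. Then, based on the characteristics of the series system, the charge and discharge problem of the distributed HESS is converted into a Markov decision process for deep reinforcement learning. To address the difficulty of choosing the entropy loss weight in the PPO algorithm, an improved PPO algorithm is proposed that balances the agent's convergence and exploration. Finally, typical operating data from a new energy power generation base are used as a case study to verify the feasibility and effectiveness of the proposed control strategy.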
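For context on the entropy loss weight mentioned above, the combined objective of standard PPO is reproduced below. This is the textbook form, not the improved variant proposed in the abstract, which does not say how the weight is adjusted. Here $r_t(\theta)$ is the probability ratio between the new and old policies, $\hat{A}_t$ is the advantage estimate, $\epsilon$ is the clipping range, and $c_1$, $c_2$ are the value-loss and entropy weights.

```latex
L_t^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\!\left[
      \min\!\bigl( r_t(\theta)\,\hat{A}_t,\;
                   \operatorname{clip}\bigl(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\bigr)\,\hat{A}_t \bigr)
    \right],
\qquad
L_t^{\mathrm{CLIP+VF+S}}(\theta)
  = \hat{\mathbb{E}}_t\!\left[
      L_t^{\mathrm{CLIP}}(\theta) - c_1\, L_t^{\mathrm{VF}}(\theta) + c_2\, S[\pi_\theta](s_t)
    \right]
```

A larger $c_2$ rewards higher policy entropy and so favors exploration, while a smaller $c_2$ lets the policy sharpen sooner; this is the trade-off between exploration and convergence that the abstract refers to.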
Edge computing has transformed smart grids by lowering latency, reducing network congestion, and enabling real-time decision-making. Nevertheless, devising an optimal task-offloading strategy remains challenging, as it must jointly minimise energy consumption and response time under fluctuating workloads and volatile network conditions. We cast the offloading problem as a Markov Decision Process (MDP) and solve it with Deep Reinforcement Learning (DRL). Specifically, we present a three-tier architecture (end devices, edge nodes, and a cloud server) and enhance Proximal Policy Optimization (PPO) to learn adaptive, energy-aware policies. A Convolutional Neural Network (CNN) extracts high-level features from system states, enabling the agent to respond continually to changing conditions. Extensive simulations show that the proposed method reduces task latency and energy consumption far more than several baseline algorithms, thereby improving overall system performance. These results demonstrate the effectiveness and robustness of the framework for real-time task offloading in dynamic smart-grid environments.
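One way to read "jointly minimise energy consumption and response time" in an MDP setting is a reward equal to the negative weighted cost of each offloading decision. The sketch below illustrates that idea only; the `Tier` fields, the cost model, the weights, and every number are hypothetical assumptions, since the abstract does not give the paper's actual state, action, or reward definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Hypothetical description of one place a task can run."""
    name: str
    cycles_per_s: float      # compute speed available to the task
    uplink_bps: float        # device-to-tier link rate (0 means no upload needed)
    joules_per_cycle: float  # device-side energy per CPU cycle (0 when offloaded)
    tx_watts: float          # device transmit power while uploading


def step_cost(task_bits: float, task_cycles: float, tier: Tier):
    """Return (latency_s, energy_j) for running one task on the given tier."""
    tx_time = task_bits / tier.uplink_bps if tier.uplink_bps > 0 else 0.0
    latency = tx_time + task_cycles / tier.cycles_per_s
    energy = tx_time * tier.tx_watts + task_cycles * tier.joules_per_cycle
    return latency, energy


def reward(latency: float, energy: float,
           w_latency: float = 0.5, w_energy: float = 0.5) -> float:
    # Negative weighted cost: maximising reward minimises latency and energy together.
    # The weights are assumed values and also absorb the seconds-vs-joules unit gap.
    return -(w_latency * latency + w_energy * energy)


# Hypothetical tiers and task (1 Mbit of input data, 1e9 CPU cycles).
LOCAL = Tier("device", cycles_per_s=1e9, uplink_bps=0.0,
             joules_per_cycle=1e-9, tx_watts=0.0)
EDGE = Tier("edge", cycles_per_s=1e10, uplink_bps=1e7,
            joules_per_cycle=0.0, tx_watts=0.5)

local_reward = reward(*step_cost(1e6, 1e9, LOCAL))
edge_reward = reward(*step_cost(1e6, 1e9, EDGE))
```

With these made-up numbers the edge node earns the higher reward, because the upload is short relative to the compute time saved. An agent trained with PPO would have to learn such trade-offs from experience as link rates and workloads change.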
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62103349) and the Henan Province Science and Technology Research Project (Grant No. 232102210104).