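The low level of automation and intelligence in coal mine roadway support equipment limits roadway formation efficiency and is a key cause of the "mining-excavation imbalance". To address the low automation and poor efficiency of this equipment, a path planning method based on deep reinforcement learning is proposed for the manipulator of a drilling and anchoring robot that integrates a boom-type roadheader with a multi-degree-of-freedom manipulator. A coal mine roadway is constructed in a virtual environment, and collision detection models are established between the manipulator and the robot body, the coal wall, and the support steel belt. Collision detection is performed in the virtual environment with the hierarchical bounding box method, yielding an obstacle avoidance strategy under the boundary constraints of the roadway. Several improvements are made on the basis of the Proximal Policy Optimization (PPO) algorithm. Because the state-space input of a multi-degree-of-freedom manipulator does not have a fixed length, an environment-state input processing method based on a Long Short-Term Memory (LSTM) network is introduced, which improves the algorithm's adaptability to the environment. In addition, because rewards are sparse, an Intrinsic Curiosity Module (ICM) is introduced, which gives the agent intrinsic rewards that encourage it to explore the environment more extensively. The agent is built on a reward-and-penalty mechanism, and its state space and action space are defined according to the motion characteristics of the drilling and anchoring robot. The agent is trained with two algorithms in the same scenario, and the algorithms are compared on reward value, steps per episode, Actor network loss, and Critic network loss, followed by simulated ablation experiments. The experimental results show that, where the original PPO algorithm cannot complete the task, the improved algorithm shortens the path length by 3.98% and the time taken by 25.6% relative to the PPO-ICM algorithm, which can also complete the task. To further verify the robustness of the improved algorithm, multiple groups of experiments were designed. The improved PPO algorithm completed the path planning task in all of them, with the distance error between the path end point and the target position within 3.88 cm and the error in the angle between the anchor bolt and the vertical direction within 3°, so it can effectively complete the path planning task and raise the automation level of the coal mine roadway support system. The results verify the feasibility and effectiveness of the proposed method for path planning of the drilling and anchoring robot's multi-degree-of-freedom manipulator when anchor hole positions vary during underground roadway support.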
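The abstract does not give the reward or loss formulas, so the following is only a minimal sketch of the two standard ingredients it names: the PPO clipped surrogate term and an ICM-style curiosity bonus computed from forward-model prediction error. The function names and the values of `clip_eps`, `eta`, and `beta` are illustrative assumptions, not the authors' settings, and the LSTM state encoder is omitted.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate term for a single sample.

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : advantage estimate for that sample
    clip_eps  : clipping range (0.2 is a common default, assumed here)
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def intrinsic_reward(predicted_next, actual_next, eta=0.5):
    """ICM-style curiosity bonus: (eta / 2) * squared forward-model error.

    predicted_next / actual_next are feature vectors of the next state
    (plain lists of floats in this sketch).
    """
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return 0.5 * eta * squared_error


def total_reward(extrinsic, predicted_next, actual_next, beta=0.1):
    """Sparse task reward plus a weighted curiosity bonus (beta is assumed)."""
    return extrinsic + beta * intrinsic_reward(predicted_next, actual_next)
```

With a sparse task reward, `total_reward` is nonzero whenever the forward model mispredicts the next state, which is what pushes the agent toward unexplored configurations.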
Edge computing has transformed smart grids by lowering latency, reducing network congestion, and enabling real-time decision-making. Nevertheless, devising an optimal task-offloading strategy remains challenging, as it must jointly minimise energy consumption and response time under fluctuating workloads and volatile network conditions. We cast the offloading problem as a Markov Decision Process (MDP) and solve it with Deep Reinforcement Learning (DRL). Specifically, we present a three-tier architecture (end devices, edge nodes, and a cloud server) and enhance Proximal Policy Optimization (PPO) to learn adaptive, energy-aware policies. A Convolutional Neural Network (CNN) extracts high-level features from system states, enabling the agent to respond continually to changing conditions. Extensive simulations show that the proposed method reduces task latency and energy consumption far more than several baseline algorithms, thereby improving overall system performance. These results demonstrate the effectiveness and robustness of the framework for real-time task offloading in dynamic smart-grid environments.
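The abstract does not specify the MDP's reward, so the sketch below shows only one plausible shape for it: the negative weighted sum of task latency and device energy for each of the three tiers. The `Tier` fields, all numeric parameters, and the weights are hypothetical; the weights are also assumed to absorb the difference in units between seconds and joules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cpu_hz: float             # processing rate in cycles per second
    uplink_bps: float         # link rate to this tier; 0 means no transmission
    tx_power_w: float         # device radio power while uploading
    local_j_per_cycle: float  # device energy per cycle (local execution only)


def latency_and_energy(task_bits, task_cycles, tier):
    """Return (latency in seconds, device energy in joules) for one task."""
    tx_time = task_bits / tier.uplink_bps if tier.uplink_bps > 0 else 0.0
    latency = tx_time + task_cycles / tier.cpu_hz
    energy = tx_time * tier.tx_power_w + task_cycles * tier.local_j_per_cycle
    return latency, energy


def reward(task_bits, task_cycles, tier, w_latency=0.5, w_energy=0.5):
    """Negative weighted cost, so a higher reward means a cheaper decision."""
    latency, energy = latency_and_energy(task_bits, task_cycles, tier)
    return -(w_latency * latency + w_energy * energy)


# Hypothetical parameters for the three tiers named in the abstract.
TIERS = [
    Tier("device", 1e9, 0.0, 0.0, 1e-9),
    Tier("edge", 1e10, 1e7, 0.5, 0.0),
    Tier("cloud", 1e11, 2e6, 0.5, 0.0),
]


def greedy_action(task_bits, task_cycles):
    """One-step greedy baseline: pick the tier with the highest reward."""
    return max(TIERS, key=lambda t: reward(task_bits, task_cycles, t)).name
```

The greedy choice is a sanity baseline rather than the learned policy: a DRL agent would also account for how the current decision affects queue lengths and link conditions in later steps.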
Funding: supported by the National Natural Science Foundation of China (Grant No. 62103349) and the Henan Province Science and Technology Research Project (Grant No. 232102210104).