Enhanced deep reinforcement learning method applied to chemical process control (Cited by: 1)
1
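Authors: Zhang Jiaxin (张佳鑫), Dong Lichun (董立春). 《化工进展》 (Chemical Industry and Engineering Progress), Peking University Core Journal, 2025, No. 10, pp. 5563-5569 (7 pages)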
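Deep reinforcement learning (DRL) algorithms require neither historical data nor prior knowledge. They optimize policies and learn autonomously solely through interaction between the agent and the environment, and therefore show good application prospects in industrial process control. Among them, control strategies based on the twin delayed deep deterministic policy gradient (TD3) algorithm effectively overcome the tendency of the deep deterministic policy gradient (DDPG) model to overestimate Q-values, a defect that leads to suboptimal policies and poor robustness. This has made TD3 the leading DRL-based control model at present. However, the original TD3 method still shows limitations when applied to industrial process control with pronounced policy fluctuations; in particular, its Q-value underestimation problem degrades control performance. To address these limitations, this paper proposes an enhanced TD3 control model (ETD3) suited to industrial process control. The model first establishes an evaluation metric to judge whether the Actor network parameters are overestimated or underestimated, and adjusts the loss function fed to the Critic network according to the result. Then, the fixed learning rate of the original TD3 is replaced with a triangular decaying cyclical learning rate to improve training convergence and control performance. Finally, the effectiveness of the enhanced TD3 algorithm is verified by applying it to the control of an industrial natural gas dehydration process.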
Keywords: process control; deep reinforcement learning; twin delayed deep deterministic policy gradient; triangular decaying cycle
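The abstract above mentions replacing a fixed learning rate with a triangular decaying cyclical learning rate but does not give the schedule itself. As a rough illustration only, the sketch below assumes a common variant in which the rate rises and falls linearly within each cycle and the peak amplitude halves from one cycle to the next. The function name `triangular_decay_lr`, its parameters, and the default values are hypothetical and are not taken from the paper.

```python
def triangular_decay_lr(step, base_lr=1e-4, max_lr=1e-3, half_cycle=1000):
    """Triangular cyclical learning rate whose amplitude halves every cycle.

    Assumed schedule for illustration; the paper's exact rule is not stated
    in the abstract. All parameter names and defaults are hypothetical.
    """
    # Index of the current full cycle (one cycle = rise + fall), 0-based.
    cycle = step // (2 * half_cycle)
    # Distance from the peak of the current cycle: 1 at the edges, 0 at the peak.
    pos = abs(step / half_cycle - 2 * cycle - 1)
    # Peak height above base_lr, halved after each completed cycle.
    amplitude = (max_lr - base_lr) / (2 ** cycle)
    return base_lr + amplitude * (1.0 - pos)


if __name__ == "__main__":
    # Print the rate at the start, peak, and end of the first two cycles.
    for s in (0, 1000, 2000, 3000, 4000):
        print(s, triangular_decay_lr(s))
```

In a training loop, a function like this would be called once per optimizer step and its return value assigned to the optimizer's learning rate; whether ETD3 applies it to the Actor, the Critic, or both is not stated in the abstract.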
Efficient sensorimotor cues for training a glider to soar autonomously
2
Authors: Siyuan ZHENG, Jiachi ZHAO, Lifang ZENG, Zhouhong WANG, Jun LI. Journal of Zhejiang University-SCIENCE A, 2026, No. 2, pp. 128-141 (14 pages)
Migratory birds depend on the perception of atmospheric updraft for long-distance flight. To realize more efficient autonomous soaring in an unpowered glider, different strategies for using potential sensorimotor cues to achieve autonomous soaring efficiency were compared and optimized. A simulation framework of autonomous soaring for an unpowered glider was developed based on a reinforcement learning algorithm. The framework was composed of three models: an updraft environment model, the glider's dynamics and control model, and a reinforcement learning agent, which learns to harvest more energy in flight. Based on the simulation, effects of different combinations of 12 potential sensorimotor cues on soaring efficiency were studied. Firstly, the absence of one particular sensorimotor cue and the use of only a single valid cue in autonomous soaring were analyzed. The results showed that the vertical airflow velocity gradient (aw) and the wing-tip updraft velocity difference (τ) have advantages over the other cues. Secondly, strategies combining aw or τ with other cues were analyzed to achieve more effective autonomous soaring, and seven potentially effective combinations of sensorimotor cues were identified. The final results showed that, among the tested combinations, the combination of vertical airflow velocity (Vw) and τ enables the most efficient autonomous soaring. This study identified a highly effective sensorimotor cue strategy to guide an intelligent glider to achieve long-distance autonomous soaring flight.
Keywords: autonomous soaring; glider; reinforcement learning; twin delayed deep deterministic policy gradient (TD3); sensorimotor cues