Collaborative Deep Reinforcement Learning Method for Multi-Objective Parameter Tuning
Abstract: Joint optimization and tuning of multi-objective control parameters is key to keeping an automation system running efficiently and stably. Reinforcement learning is often used to build automated tuning agents that replace human experts in parameter tuning. Existing methods use fixed weights to combine multiple optimization objectives linearly into a single objective and train a single agent with fixed tuning knowledge. When the environment causes the actual relationship between the objectives to deviate from the prior assumption, such an agent cannot perceive the change or adapt its decisions, which limits tuning performance. To address this problem, a collaborative deep reinforcement learning method for multi-objective parameter tuning is proposed. The method first uses offline simulation to learn objective tuning knowledge and build multiple Double-DQN agents. It then establishes online feedback on tuning results to perceive the actual relationship between the objectives and adjust the agents' coordination strategy, achieving effective multi-objective parameter tuning. Experiments on parameter tuning for automatic train operation show that the method tunes well for the two objectives of parking error and comfort, adapts to different train and track performance, and can continue to optimize over time, giving it great practical value.
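The abstract names two ingredients: Double-DQN agents and a coordination strategy that is adjusted from online feedback. The sketch below illustrates both under stated assumptions. `double_dqn_target` is the standard Double-DQN bootstrap target (the online network selects the next action, the target network evaluates it). `coordinated_action` and `update_weights` are hypothetical: they assume one agent per objective, a weighted sum of per-agent Q-values, and a simple multiplicative weight update. The abstract does not specify the paper's actual coordination rule, so these are illustrative stand-ins only.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Standard Double-DQN target for a single transition.

    The online network's Q-values choose the next action; the target
    network's Q-values score that action.
    """
    if done:
        return reward
    best_action = argmax(q_online_next)
    return reward + gamma * q_target_next[best_action]


def coordinated_action(
    q_per_agent: Sequence[Sequence[float]], weights: Sequence[float]
) -> int:
    """Hypothetical coordination rule: pick the action that maximises the
    weighted sum of the per-objective agents' Q-values."""
    n_actions = len(q_per_agent[0])
    combined = [
        sum(w * q[a] for w, q in zip(weights, q_per_agent))
        for a in range(n_actions)
    ]
    return argmax(combined)


def update_weights(
    weights: Sequence[float], shortfalls: Sequence[float], step: float = 0.5
) -> List[float]:
    """Hypothetical online feedback: shift weight toward objectives with a
    larger measured shortfall, then renormalise so the weights sum to 1."""
    raw = [w * (1.0 + step * s) for w, s in zip(weights, shortfalls)]
    total = sum(raw)
    return [r / total for r in raw]
```

With two agents (for example, one for parking error and one for comfort), calling `update_weights` after each online tuning round and passing the result to `coordinated_action` lets the weighting follow the observed relationship between the objectives, in contrast to the fixed-weight scalarization the abstract criticises.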
Authors: LUO Senlin (罗森林); WEI Jixun (魏继勋); LIU Xiaoshuang (刘晓双); PAN Limin (潘丽敏) (School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China)
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2022, No. 9, pp. 969-975 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: parameter tuning; multi-objective; reinforcement learning; automation system; coordination