Application of Reinforcement Learning in Missile Guidance (Cited by: 6)

Original title: 强化学习在导弹制导中的应用
Abstract: The basic principles and characteristics of reinforcement learning are outlined, and the approximation of the reinforcement learning value function with neural networks is discussed. The analysis focuses on learning with multiple (modular) neural networks as the value function approximator, which decomposes the state space or task automatically and improves the generalization ability of the approximated value function. The networks are trained offline and then used online as a feedback controller. Taking the A learning algorithm as an example, reinforcement learning is applied to the missile guidance problem; the simulation results indicate that reinforcement learning is effective and promising for missile guidance and control problems.
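Authors: Zhou Rui (周锐), Chen Zongji (陈宗基)
Source: Control Theory & Applications (《控制理论与应用》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2001, No. 5, pp. 748-750 (3 pages)
Funding: National Natural Science Foundation of China (69904002); National Defense Pre-Research Foundation; Aerospace Science and Technology Innovation Foundation
Keywords: neural networks; reinforcement learning; differential games; missile guidance; artificial intelligence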

References (5)

  • [1] Yan Pingfan (阎平凡). Reinforcement learning: principles, algorithms and applications in intelligent control [J]. Information and Control, 1996, 25(1): 28-34. Cited by: 30
  • [2] Xu B Z, Zhang B L and Wei G. Neural Networks and Its Applications [M]. Guangzhou: South China University of Technology Press, 1994.
  • [3] Watkins C J C H and Dayan P. Technical note: Q-learning [J]. Machine Learning, 1992, 8(3-4): 279-292.
  • [4] Baird L. Residual algorithms: Reinforcement learning with function approximation [A]. Proceedings of the Twelfth International Conference on Machine Learning [C]. San Francisco, CA: Morgan Kaufmann Publishers, 1995.
  • [5] Jacobs R A and Jordan M I. Learning piecewise control strategies in a modular neural network architecture [J]. IEEE Transactions on Systems, Man, and Cybernetics, 1993, 23(2): 337-345.

Secondary references (6)

  • [1] Leslie Pack Kaelbling. Associative Reinforcement Learning: Functions in k-DNF [J]. Machine Learning, 1994(3): 279-298.
  • [2] Leslie Pack Kaelbling. Associative Reinforcement Learning: A Generate and Test Algorithm [J]. Machine Learning, 1994(3): 299-319.
  • [3] Leslie Pack Kaelbling. Associative Reinforcement Learning: Functions in k-DNF [J]. Machine Learning, 1994(3): 279-298.
  • [4] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning [J]. Machine Learning, 1992(3-4): 229-256.
  • [5] Christopher J.C.H. Watkins, Peter Dayan. Technical Note: Q-Learning [J]. Machine Learning, 1992(3-4): 279-292.
  • [6] Richard S. Sutton. Learning to Predict by the Methods of Temporal Differences [J]. Machine Learning, 1988(1): 9-44.

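Co-citing documents (sharing references with this paper): 29

Co-cited documents: 86

Citing documents: 6

Secondary citing documents: 47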
