The Application of Hierarchical Reinforcement Learning to the Robot Football (分层强化学习在足球机器人中的应用)

Cited by: 2
Abstract: This paper applies MaxQ hierarchical reinforcement learning to the learning of offensive strategies for soccer robots, which improves the performance of reinforcement learning. Application and experiments in RoboCup show that the approach based on MaxQ hierarchical reinforcement learning outperforms traditional reinforcement learning methods.
Source: Control & Automation (《微计算机信息》), Peking University Core Journal, 2008, No. 32, pp. 231-233 (3 pages)
Keywords: reinforcement learning; Q_learning algorithm; MaxQ algorithm; RoboCup