Survey of Multi-agent Reinforcement Learning in Markov Games (cited by: 13)

Abstract: Multi-agent learning studies, within the framework of stochastic games, how multiple agents acquire interaction skills through self-learning. The success of single-agent reinforcement learning research, the solid mathematical foundation of game theory, and the broad application prospects in complex task environments have made multi-agent reinforcement learning an important topic in machine learning research. This survey first gives formal definitions of the basic concepts of stochastic games in multi-agent systems. It then reviews research on learning algorithms for stochastic games and repeated games, along with other related work. Finally, in light of recent developments, it surveys applications of multi-agent learning in e-commerce, robotics, and military domains, and discusses the remaining open problems and directions for future research.
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2005, No. 10, pp. 1081-1090 (10 pages)
Keywords: multi-agent system; stochastic game; reinforcement learning