Multiagent cooperative decision making based on independent learning
Abstract: Joint learning is an effective way to implement multiagent cooperative decision making, but it is difficult to apply when the agents have incomplete information. To address this, a multiagent cooperative decision-making method built on the agents' independent learning is proposed. Experiments on grid games show that the method is effective.
Source: Control and Decision (《控制与决策》), 2002, No. 1, pp. 29-32 (4 pages); indexed in EI, CSCD, and the Peking University Core Journals list.
Keywords: independent learning; multiagent cooperative decision making; intelligent control; learning algorithm; joint learning mode; multiagent reinforcement learning; Markov cooperative decision process
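
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what independent learning can look like in a grid game, not the paper's method. It assumes tabular Q-learning, a 3 x 3 grid, two agents with fixed start and goal cells, and a shared team reward; every name here (`IndependentQLearner`, `step`, `train`) and every constant is hypothetical. The point it illustrates is that each agent keeps a table indexed only by its own position and its own action, and never observes the other agent's action.

```python
import random
from collections import defaultdict

# Assumed setup: 3 x 3 grid, four moves per agent (all values are illustrative).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SIZE = 3


def step(pos, action):
    """Move one cell, clamping the result to the SIZE x SIZE grid."""
    x = min(max(pos[0] + action[0], 0), SIZE - 1)
    y = min(max(pos[1] + action[1], 0), SIZE - 1)
    return (x, y)


class IndependentQLearner:
    """Tabular Q-learner that sees only its own state and its own action."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, rng=None):
        self.q = defaultdict(float)  # keyed by (own_state, own_action_index)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()

    def act(self, state):
        """Epsilon-greedy choice over the agent's own action values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        values = [self.q[(state, a)] for a in range(len(ACTIONS))]
        return values.index(max(values))

    def update(self, state, action, reward, next_state, done):
        """One Q-learning update; no bootstrapping from a terminal state."""
        best_next = max(self.q[(next_state, a)] for a in range(len(ACTIONS)))
        target = reward if done else reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(episodes=200, max_steps=30, seed=0):
    """Train two independent learners that share a single team reward."""
    rng = random.Random(seed)
    starts = [(0, 0), (2, 0)]
    goals = [(2, 2), (0, 2)]
    agents = [IndependentQLearner(rng=rng) for _ in starts]
    for _ in range(episodes):
        positions = list(starts)
        for _ in range(max_steps):
            actions = [ag.act(p) for ag, p in zip(agents, positions)]
            nxt = [step(p, ACTIONS[a]) for p, a in zip(positions, actions)]
            if nxt[0] == nxt[1]:  # assumed collision rule: both agents stay put
                nxt = list(positions)
            done = nxt == goals
            reward = 1.0 if done else -0.01  # assumed shared reward signal
            for ag, p, a, n in zip(agents, positions, actions, nxt):
                ag.update(p, a, reward, n, done)
            positions = nxt
            if done:
                break
    return agents
```

A joint learner would instead index its table by the joint state and joint action of all agents, which requires each agent to observe the others. The sketch avoids that requirement at the cost of each agent facing a non-stationary environment, so convergence is not guaranteed in general.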