
Multi-agent opponent modeling and true model identification (cited by: 2)
Abstract: This work studies how to better predict an opponent's behavior in a competitive environment and choose one's own strategy accordingly. Opponent agents in the environment are modeled with an interactive dynamic influence diagram, and a method based on Bayesian networks is proposed for identifying the opponent's true model. First, the opponent's candidate models are stored in the model node, and the opponent's models, beliefs, and actions are inferred and updated in real time. Then, in each interaction, the observed action sequence of the opponent is recorded and used as the training set for a dynamic Bayesian network; once the network parameters are learned, the weights of the candidate models are recalculated to identify the opponent's true model. Finally, experiments on the multi-agent tiger problem and an unmanned aerial vehicle (UAV) reconnaissance problem verify the effectiveness of the algorithm in terms of both the weights of the opponent's candidate models and the payoff of our own agent.
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2015, No. 10, pp. 48-52 (5 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding: National Natural Science Foundation of China (60975052, 61375070); Fujian Provincial Major Science and Technology Project (2011H6027).
Keywords: multi-agent; opponent modeling; interactive dynamic influence diagram; dynamic Bayesian network; strategy
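
The reweighting step described in the abstract (observe the opponent's actions, then recompute the weights of the candidate models) can be illustrated with a minimal sketch. This is not the paper's method: the paper trains a dynamic Bayesian network on the recorded action sequences, whereas the sketch below swaps in a plain Bayesian likelihood update and assumes each candidate model is summarized by a fixed action distribution. The model names (`cautious`, `reckless`), the action names, and all probabilities are hypothetical values chosen only for illustration.

```python
from typing import Dict, List


def update_weights(
    weights: Dict[str, float],
    action_probs: Dict[str, Dict[str, float]],
    observed_actions: List[str],
    eps: float = 1e-9,
) -> Dict[str, float]:
    """Reweight candidate opponent models after observing a sequence of actions.

    weights: prior weight of each candidate model (assumed to sum to 1).
    action_probs: for each model, an assumed fixed probability of each action.
    eps: small constant so that an unseen action never zeroes out a model.
    """
    posterior = dict(weights)
    for action in observed_actions:
        # Multiply each model's weight by the likelihood of the observed action.
        for model in posterior:
            posterior[model] *= action_probs[model].get(action, 0.0) + eps
        # Renormalize so the weights remain a probability distribution.
        total = sum(posterior.values())
        posterior = {m: w / total for m, w in posterior.items()}
    return posterior


# Hypothetical candidate models for a tiger-problem-style opponent.
prior = {"cautious": 0.5, "reckless": 0.5}
probs = {
    "cautious": {"listen": 0.8, "open_left": 0.1, "open_right": 0.1},
    "reckless": {"listen": 0.2, "open_left": 0.4, "open_right": 0.4},
}
observed = ["listen", "listen", "open_left", "listen"]

posterior = update_weights(prior, probs, observed)
most_likely = max(posterior, key=posterior.get)
print(most_likely, posterior)
```

In this toy run the weight shifts toward the model whose action distribution best explains the observed sequence, which is the same role the recalculated candidate weights play in identifying the opponent's true model.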

