
Multi-agent reinforcement learning for scene graph-driven target search

Cited by: 2
Abstract To address problems of reinforcement learning in visual semantic navigation, namely low accuracy, low navigation efficiency, poor fault tolerance, and the restriction of some methods to a single agent, we propose a multi-agent target search algorithm based on scene priors. The algorithm uses reinforcement learning to extend a single-agent system to a multi-agent system. It has two main components: first, a scene graph serves as prior knowledge to assist the agent team in visual exploration; second, multi-agent reinforcement learning with centralized training and distributed exploration is used to substantially improve the team's accuracy and efficiency. Training and testing in AI2THOR, with comparison against other algorithms, show that the method outperforms them in both target search accuracy and efficiency.
Authors 陆升阳 (LU Shengyang); 赵怀林 (ZHAO Huailin); 刘华平 (LIU Huaping) (School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai 201418, China; Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Source CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 1, pp. 207-215 (9 pages)
Funding National Natural Science Foundation of China (U1613212).
Keywords multi-agent; reinforcement learning; visual semantic navigation; scene graph; prior knowledge; distributed exploration; centralized training; target search
