
Multi-robot Reinforcement Learning Strategy Based on ILCS
Abstract: This paper proposes a multi-robot reinforcement learning method based on an improved learning classifier system (ILCS). Reinforcement learning enables each robot to discover a set of rules that guide its behavior. A genetic algorithm eliminates the weaker rules in the current population and generates new rules from the fitter ones. Rule merging improves the efficiency of parallel reinforcement learning across multiple robots, so that the robots autonomously learn an optimal strategy for cooperating with one another. Algorithm analysis and simulation indicate that applying the improved learning classifier system to multi-robot reinforcement learning is feasible and effective.
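Source: Communications Technology (《通信技术》), 2010, No. 4, pp. 220-222 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60705020), "Research on Active Learning for Environment Perception of Mobile Robots"
Keywords: reinforcement learning; multi-robot; improved learning classifier system; genetic algorithm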
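
The abstract names three cooperating components: reinforcement of rules, a genetic algorithm that replaces weak rules with offspring of strong ones, and merging of rules across robots. The sketch below is one minimal way these pieces could fit together in a classifier-system style. It is an illustration under stated assumptions, not the paper's ILCS: this record does not give the rule encoding, the update rule, or the merge criterion. The ternary condition strings, the strength update, and the names `Rule`, `reinforce`, `genetic_step`, and `merge_rules` are all hypothetical.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class Rule:
    """A classifier: if the state matches `condition`, propose `action`."""
    condition: str          # assumed encoding over '0', '1', '#' ('#' matches either bit)
    action: int
    strength: float = 10.0  # running estimate of how useful the rule is

    def matches(self, state: str) -> bool:
        return all(c == "#" or c == s for c, s in zip(self.condition, state))


def reinforce(rule: Rule, reward: float, beta: float = 0.2) -> None:
    """Move the rule's strength a fraction `beta` toward the received reward."""
    rule.strength += beta * (reward - rule.strength)


def genetic_step(rules: list[Rule], rng: random.Random,
                 n_replace: int = 2, mutation_rate: float = 0.05) -> list[Rule]:
    """Drop the `n_replace` weakest rules and refill with offspring of strong ones.

    Assumes at least two rules, conditions of length >= 2, and n_replace < len(rules).
    """
    ranked = sorted(rules, key=lambda r: r.strength, reverse=True)
    survivors = ranked[:len(ranked) - n_replace]
    parents = ranked[:max(2, len(ranked) // 2)]
    children = []
    for _ in range(n_replace):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a.condition))        # one-point crossover
        cond = a.condition[:cut] + b.condition[cut:]
        cond = "".join(rng.choice("01#") if rng.random() < mutation_rate else c
                       for c in cond)                   # per-symbol mutation
        children.append(Rule(cond, a.action, (a.strength + b.strength) / 2))
    return survivors + children


def merge_rules(populations: list[list[Rule]]) -> list[Rule]:
    """Pool rules from several robots, keeping the strongest copy of each rule."""
    best: dict[tuple[str, int], Rule] = {}
    for population in populations:
        for rule in population:
            key = (rule.condition, rule.action)
            if key not in best or rule.strength > best[key].strength:
                best[key] = rule
    return list(best.values())
```

In this sketch each robot would call `reinforce` on the rule it acted on, run `genetic_step` on its own population from time to time, and periodically exchange populations through `merge_rules` so that a rule learned by one robot becomes available to the others.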
