Application of Parallel Reinforcement Learning Based on Artificial Neural Network to Adaptive Path Planning

Cited by: 7
Abstract: Reinforcement learning builds a mapping from environment states to actions through repeated trial-and-error interaction with a knowledge-poor environment. This paper uses the feedback of an artificial neural network to adjust the network weights and combines it with a parallel reinforcement learning algorithm, which has high learning efficiency, to propose a method for applying parallel reinforcement learning based on an artificial neural network. Simulation experiments verify the convergence of the iterative process and the feasibility of the method, which completes the path-learning task effectively.
Affiliation: Northwestern Polytechnical University
Source: Science Technology and Engineering, 2011, No. 4, pp. 756-759 (4 pages)
Keywords: parallel reinforcement learning; BP neural network; path planning; Q-learning
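
The abstract names the ingredients (Q-learning, a BP neural network, several learners working in parallel, path planning) but this record does not give the algorithm itself. The following is a minimal sketch of how those pieces can fit together, not the authors' implementation. The 5x5 grid, obstacle layout, rewards, network size, and every hyperparameter are illustrative assumptions, and the "parallel" learners are simulated by running several agents in turn against one shared network.

```python
import numpy as np

# Hypothetical 5x5 grid world: layout, rewards, and hyperparameters are
# illustrative assumptions, not values taken from the paper.
SIZE = 5
GOAL = (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(pos, action):
    """Apply an action; return (next_pos, reward, done)."""
    r, c = pos[0] + ACTIONS[action][0], pos[1] + ACTIONS[action][1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return pos, -1.0, False  # blocked move: stay in place, take a penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -0.1, False


class QNetwork:
    """One-hidden-layer network trained by backpropagation on the TD error."""

    def __init__(self, hidden=16, lr=0.05, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (hidden, SIZE * SIZE))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (len(ACTIONS), hidden))
        self.b2 = np.zeros(len(ACTIONS))
        self.lr, self.gamma = lr, gamma

    def _forward(self, pos):
        idx = pos[0] * SIZE + pos[1]  # one-hot input selects one column of w1
        h = np.tanh(self.w1[:, idx] + self.b1)
        return idx, h, self.w2 @ h + self.b2

    def q_values(self, pos):
        return self._forward(pos)[2]

    def update(self, pos, action, reward, next_pos, done):
        """One gradient step on 0.5 * (target - Q(s, a))^2; returns the TD error."""
        target = reward if done else reward + self.gamma * self.q_values(next_pos).max()
        idx, h, q = self._forward(pos)
        err = target - q[action]
        grad_h = err * self.w2[action] * (1.0 - h ** 2)  # backprop through tanh
        self.w2[action] += self.lr * err * h
        self.b2[action] += self.lr * err
        self.w1[:, idx] += self.lr * grad_h
        self.b1 += self.lr * grad_h
        return err


def train(net, starts, episodes=300, max_steps=50, eps=0.2, seed=1):
    """Simulated parallel learners: one agent per start cell, all sharing `net`."""
    rng = np.random.default_rng(seed)
    for _episode in range(episodes):
        for pos in starts:  # agents run in turn here, not truly concurrently
            for _t in range(max_steps):
                if rng.random() < eps:
                    a = int(rng.integers(len(ACTIONS)))  # explore
                else:
                    a = int(np.argmax(net.q_values(pos)))  # exploit
                nxt, reward, done = step(pos, a)
                net.update(pos, a, reward, nxt, done)
                pos = nxt
                if done:
                    break


def greedy_path(net, start, max_steps=50):
    """Follow the highest-valued action from `start` until the goal or step limit."""
    path, pos = [start], start
    for _t in range(max_steps):
        pos, _reward, done = step(pos, int(np.argmax(net.q_values(pos))))
        path.append(pos)
        if done:
            break
    return path


if __name__ == "__main__":
    net = QNetwork()
    train(net, starts=[(0, 0), (0, 4), (4, 0)])
    print(greedy_path(net, (0, 0)))
```

Sharing one network among the agents is only one way to combine their experience; merging separately learned value estimates is another, and the record does not say which the paper uses. Function approximation also gives no convergence guarantee in general, so the printed path should be inspected rather than assumed to reach the goal.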
References (7)

  • 1. Sutton R S, Barto A G. Reinforcement learning: An introduction[M]. Cambridge, MA: MIT Press, 1998.
  • 2. Gao Yang, Chen Shifu, Lu Xin. A survey of reinforcement learning research[J]. Acta Automatica Sinica, 2004, 30(1): 86-100. (in Chinese) Cited by: 295
  • 3. Weng Juyang. On developmental mental architectures[J]. Neurocomputing, 2007, 70: 2303-2323.
  • 4. Meng Wei, Han Xuedong. Parallel reinforcement learning algorithms and their application[J]. Computer Engineering and Applications, 2009, 45(34): 25-28. (in Chinese) Cited by: 7
  • 5. Watkins C J C H, Dayan P. Q-learning[J]. Machine Learning, 1992, 8(3): 279-292.
  • 6. Fierro R, Lewis F L. Control of a nonholonomic mobile robot using neural networks[J]. IEEE Transactions on Neural Networks, 1998, 9(4): 589-600.
  • 7. Yager R. On the Dempster-Shafer framework and new combination rules[J]. Information Sciences, 1987, 41: 93-137.
