A survey of direct policy search methods in reinforcement learning (cited by: 8)
Abstract: This paper briefly introduces the various policy search algorithms in reinforcement learning and establishes a theoretical framework for policy gradient methods. Guided by this framework, several existing policy gradient algorithms are generalized. Recent methods for improving the convergence speed of policy gradient algorithms are discussed, recent advances in non-policy-gradient search algorithms are described, and directions for future research are outlined.
Source: CAAI Transactions on Intelligent Systems (智能系统学报), 2007, No. 1, pp. 16-24 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (60234030, 60303012).
Keywords: reinforcement learning; policy search; policy gradient
References (36)

  • [2] SUTTON R, BARTO A. Reinforcement learning: an introduction[M]. MIT Press, 1998.
  • [3] SINGH S P. Learning to solve Markovian decision processes[D]. University of Massachusetts, 1994.
  • [4] VAN ROY B. Learning and value function approximation in complex decision processes[M]. MIT Press, 1998.
  • [5] WATKINS C. Learning from delayed rewards[D]. Cambridge: University of Cambridge, 1989.
  • [6] HUMPHRYS M. Action selection methods using reinforcement learning[D]. Cambridge: University of Cambridge, 1996.
  • [7] BERTSEKAS D P, TSITSIKLIS J N. Neuro-dynamic programming[M]. Athena Scientific, Belmont, Mass., 1996.
  • [8] SUTTON R S, MCALLESTER D, SINGH S, et al. Policy gradient methods for reinforcement learning with function approximation[A]. In: Advances in Neural Information Processing Systems[C]. Denver, USA, 2000.
  • [9] BAIRD L C. Residual algorithms: reinforcement learning with function approximation[A]. In: Proc. of the 12th Int. Conf. on Machine Learning[C]. San Francisco, 1995.
  • [10] TSITSIKLIS J N, VAN ROY B. Feature-based methods for large scale dynamic programming[J]. Machine Learning, 1996(22): 59-94.
  • [12] BAXTER J, BARTLETT P L. Infinite-horizon policy-gradient estimation[J]. Journal of Artificial Intelligence Research, 2001(15): 319-350.

Co-cited documents: 56

Citing documents: 8

Second-level citing documents: 34
