Off-Policy Reinforcement Learning with Gaussian Processes (cited by: 2)
Authors: Girish Chowdhary, Miao Liu, Robert Grande, Thomas Walsh, Jonathan How, Lawrence Carin. IEEE/CAA Journal of Automatica Sinica (SCIE, EI), 2014, Issue 3, pp. 227-238 (12 pages)
An off-policy Bayesian nonparametric approximate reinforcement learning framework, termed GPQ, that employs a Gaussian process (GP) model of the value (Q) function is presented in both the batch and online settings. Sufficient conditions on GP hyperparameter selection are established to guarantee convergence of off-policy GPQ in the batch setting, and theoretical and practical extensions are provided for the online case. Empirical results demonstrate that GPQ has competitive learning speed in addition to its convergence guarantees and its ability to automatically choose its own basis locations.