Abstract
This paper proposes DCMAC-AL (dynamic cerebellar model articulation controller-advantage learning), a dynamic reinforcement learning method based on the CMAC (cerebellar model articulation controller). The method uses advantage(λ) learning to compute the state-action function, which enlarges the differences between the values of different actions and thereby avoids action oscillation. On top of CMAC function approximation, it then uses the Bellman error to add features dynamically, improving the adaptivity of the approximation. The multi-agent defensive task (takeaway) is modeled on the RoboCup soccer simulation platform, and learning experiments are run with the proposed algorithm. The results show that DCMAC-AL achieves better learning performance than advantage(λ) learning with CMAC.
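The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-dimensional input in [0, 1), a fixed set of offset tilings, and the one-step advantage-learning target in its commonly cited form with a scaling factor k. Eligibility traces (the λ part) and the Bellman-error-driven feature addition are left out, because the abstract does not specify them. All names (`CMAC`, `advantage_target`, `k`) are hypothetical.

```python
class CMAC:
    """Tiny CMAC (tile-coding) approximator for a scalar input in [0, 1)."""

    def __init__(self, n_tilings=4, n_tiles=8, alpha=0.1):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.alpha = alpha
        # One extra tile per tiling absorbs the shift caused by the offset.
        self.w = [[0.0] * (n_tiles + 1) for _ in range(n_tilings)]

    def active_tiles(self, x):
        """Return one active tile index per tiling."""
        tiles = []
        for t in range(self.n_tilings):
            offset = t / (self.n_tilings * self.n_tiles)
            tiles.append(int((x + offset) * self.n_tiles))
        return tiles

    def value(self, x):
        """Sum the weights of the active tiles."""
        return sum(self.w[t][i] for t, i in enumerate(self.active_tiles(x)))

    def update(self, x, target):
        """Move the prediction at x a fraction alpha toward the target."""
        step = self.alpha * (target - self.value(x)) / self.n_tilings
        for t, i in enumerate(self.active_tiles(x)):
            self.w[t][i] += step


def advantage_target(a_s, reward, a_next, gamma=0.9, k=0.5):
    """One-step advantage-learning target for the action just taken.

    a_s and a_next hold the current estimates for every action in the
    current and next state. k in (0, 1] is the assumed scaling factor;
    k = 1 gives the ordinary Q-learning target, and smaller k widens the
    gap between the best action and the others.
    """
    v_s = max(a_s)
    v_next = max(a_next)
    return v_s + (reward + gamma * v_next - v_s) / k
```

In a full agent one such approximator would be kept per action, and each would be trained toward `advantage_target` for the states in which that action was taken.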
Source
《广西师范大学学报(自然科学版)》
CAS
PKU Core (Peking University Core Journals)
2010, No. 3, pp. 99-103 (5 pages)
Journal of Guangxi Normal University: Natural Science Edition
Funding
National Natural Science Foundation of China (60702056)
Jiangsu Province Graduate Innovation Project (ZX09B2042)