Abstract
This paper studies a path-tracking control method for mobile robots based on adaptive heuristic critic (AHC) reinforcement learning. The adaptive critic element (ACE) of the AHC is implemented as a multi-layer feedforward neural network, whose weights are updated by combining the TD(λ) algorithm with gradient descent. The action selection element (ASE) is a fuzzy inference system (FIS) optimized by a genetic algorithm. The output of the ACE forms a secondary reinforcement signal that guides the learning of the ASE. Finally, the proposed algorithm is applied to behavior learning on a mobile robot, and the experiment shows that it effectively solves the robot's complex path-tracking problem.
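The critic update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hidden-layer tanh network, the class name `TinyCritic`, and the values of `alpha`, `gamma`, and `lam` are all assumptions chosen for illustration. It shows the standard TD(λ) rule with eligibility traces applied to network weights by gradient descent, and it returns the TD error, which plays the role of the secondary (internal) reinforcement signal in the classic AHC formulation.

```python
import math
import random


class TinyCritic:
    """Hypothetical ACE: one-hidden-layer tanh network V(s) trained by TD(lambda)."""

    def __init__(self, n_in, n_hidden, alpha=0.05, gamma=0.95, lam=0.8, seed=0):
        rng = random.Random(seed)
        # Small random initial weights (an assumption, not taken from the paper).
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.reset_traces()

    def reset_traces(self):
        """Clear the eligibility traces, e.g. at the start of each episode."""
        self.e1 = [[0.0] * len(row) for row in self.w1]
        self.e2 = [0.0] * len(self.w2)

    def _hidden(self, s):
        return [math.tanh(sum(w * x for w, x in zip(row, s))) for row in self.w1]

    def value(self, s):
        """Critic output V(s): linear readout of the tanh hidden layer."""
        return sum(w * h for w, h in zip(self.w2, self._hidden(s)))

    def update(self, s, r, s_next, terminal=False):
        """One TD(lambda) step; returns the TD error (internal reinforcement)."""
        h = self._hidden(s)
        v = sum(w * hj for w, hj in zip(self.w2, h))
        v_next = 0.0 if terminal else self.value(s_next)
        delta = r + self.gamma * v_next - v

        # Decay the traces and accumulate the gradient of V(s) w.r.t. each weight.
        decay = self.gamma * self.lam
        for j, hj in enumerate(h):
            grad_pre = self.w2[j] * (1.0 - hj * hj)  # dV / d(pre-activation j)
            for i, x in enumerate(s):
                self.e1[j][i] = decay * self.e1[j][i] + grad_pre * x
            self.e2[j] = decay * self.e2[j] + hj     # dV / dw2[j]

        # Gradient-descent style weight update, scaled by the TD error.
        for j in range(len(self.w2)):
            for i in range(len(s)):
                self.w1[j][i] += self.alpha * delta * self.e1[j][i]
            self.w2[j] += self.alpha * delta * self.e2[j]
        return delta


if __name__ == "__main__":
    critic = TinyCritic(n_in=2, n_hidden=4)
    state, next_state = [0.2, -0.1], [0.1, 0.0]  # made-up tracking errors
    signal = critic.update(state, r=-0.5, s_next=next_state)
    print("internal reinforcement:", signal)
```

In a full AHC loop, the returned value would be passed to the action selection element to adjust its parameters. The genetic-algorithm tuning of the fuzzy inference system is outside the scope of this sketch.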
Source
《控制与决策》
EI
CSCD
Peking University Core Journal
2009, No. 4, pp. 532-536, 541 (6 pages)
Control and Decision
Funding
National Natural Science Foundation of China (60475036)