Abstract
Reinforcement learning is a machine learning method for sequential decision problems. Through a continuous “interaction and trial-and-error” mechanism, an agent interacts repeatedly with its environment and thereby learns an optimal policy for completing a task, which mirrors the way humans improve their own decision-making. Knowledge, understood here as structured information comprising experience, values, cognitive rules, and expert opinion, can be applied to reinforcement learning to improve the agent's learning efficiency and reduce the difficulty of learning. Starting from the basic theory of reinforcement learning, this paper systematically summarizes and organizes research results on deep reinforcement learning and knowledge-based deep reinforcement learning.
Source
Systems Engineering and Electronics (《系统工程与电子技术》)
EI
CSCD
PKU Core (北大核心)
2017, No. 11, pp. 2603-2613 (11 pages)
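Funding
Supported by the Pre-research Fund of the General Armament Department (9140A06020315JB25081)
China Postdoctoral Science Foundation, 8th Special Funding Program (2015T81081)
China Postdoctoral Science Foundation, 60th General Program (2016M6029174)
Natural Science Foundation of Jiangsu Province, Youth Fund General Program (BK20140075)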
Keywords
deep reinforcement learning
knowledge
exploration strategy
inverse reinforcement learning