Abstract
Intelligent missiles are expected to become precise and effective strike weapons on future battlefields, and missile intelligence has become a major development trend. Building on the traditional proportional guidance law, this paper proposes a reinforcement-learning-based guidance algorithm with a variable proportional coefficient. The algorithm takes the line-of-sight rate as the state, designs the reward function from the miss distance, and uses a discretized action space to select the guidance command for the missile. Simulation results show that the proposed algorithm achieves better guidance accuracy than the traditional proportional guidance law and gives the missile an autonomous decision-making capability.
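The ingredients named in the abstract (line-of-sight rate as state, a discretized action space of proportional coefficients, a reward derived from the miss distance, and Q-learning as listed in the keywords) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate coefficients, the bin edges, the reward shape, the learning rate, the discount factor, and all function names are hypothetical.

```python
import random

# All numeric values below are illustrative assumptions, not taken from the paper.
N_CHOICES = [2.0, 3.0, 4.0, 5.0]             # discrete action space: candidate proportional coefficients
LOS_RATE_EDGES = [-0.05, -0.01, 0.01, 0.05]  # assumed bin edges (rad/s) for the line-of-sight rate


def discretize_los_rate(los_rate):
    """Map a continuous line-of-sight rate to a state index in 0..len(LOS_RATE_EDGES)."""
    state = 0
    for edge in LOS_RATE_EDGES:
        if los_rate >= edge:
            state += 1
    return state


def make_q_table():
    """Create a zero-initialized Q-table with one row per state and one column per action."""
    n_states = len(LOS_RATE_EDGES) + 1
    return [[0.0] * len(N_CHOICES) for _ in range(n_states)]


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy selection of an index into N_CHOICES."""
    if rng.random() < epsilon:
        return rng.randrange(len(N_CHOICES))
    row = q_table[state]
    return row.index(max(row))


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, terminal=False):
    """One tabular Q-learning (temporal-difference) update, applied in place."""
    target = reward if terminal else reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


def guidance_command(n_coeff, closing_speed, los_rate):
    """Proportional guidance: commanded acceleration = N * closing speed * LOS rate."""
    return n_coeff * closing_speed * los_rate


def terminal_reward(miss_distance):
    """Hypothetical reward shape: a smaller miss distance yields a larger reward."""
    return -miss_distance
```

In a training loop, each guidance step would discretize the measured line-of-sight rate, pick a coefficient with `choose_action`, apply `guidance_command`, and call `q_update`; the miss-distance reward would be applied once at the end of an engagement with `terminal=True`.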
Authors
张秦浩
敖百强
张秦雪
ZHANG Qinhao; AO Baiqiang; ZHANG Qinxue (Beijing Institute of Electronic Engineering, Beijing 100854, China; College of Computer Science, North China Institute of Aerospace Engineering, Langfang 065000, China)
Source
《系统工程与电子技术》
EI
CSCD
Peking University Core Journal
2020, Issue 2, pp. 414-419 (6 pages)
Systems Engineering and Electronics
Funding
Supported by the China Postdoctoral Science Foundation (2017M620863)
Keywords
proportional guidance
guidance law
miss distance
maneuvering target
reinforcement learning
Q-learning
temporal difference algorithm