
Cooperative penetration method for heterogeneous multiple UAVs based on the MA2IDDPG algorithm (cited by: 5)

Cooperative penetration method of heterogeneous multiple unmanned aerial vehicles based on multi-agent asynchronous imitative deep deterministic policy gradient algorithm
Abstract To generate intelligent cooperative penetration strategies for heterogeneous multiple unmanned aerial vehicles (UAVs), a cooperative penetration method based on the multi-agent asynchronous imitative deep deterministic policy gradient (MA2IDDPG) algorithm is proposed. First, the typical DDPG method is extended with an asynchronous parallel framework to improve the efficiency of experience collection. Second, two reward functions are constructed: a guiding reward function based on expert experience and knowledge, and a descriptive reward function based on task results. Third, phased training lets the cooperative penetration strategy produced by the deep neural network quickly reach the level of the expert knowledge and then further improve its confrontation performance. Finally, a heterogeneous multi-UAV cooperative penetration environment is built in simulation, and the confrontation performance of the improved method is compared with that of the typical DDPG method. The experimental results show that the MA2IDDPG method can effectively generate cooperative penetration strategies for multiple UAVs, and that it is more stable during training and achieves better confrontation results than the typical DDPG method. The proposed MA2IDDPG framework can be applied to UAV swarm control and is a useful reference in particular for research on intelligent control of heterogeneous UAV swarms.
Authors CHANG Xin (畅鑫); LI Yanbin (李艳斌); ZHAO Yan (赵研); DU Yufeng (杜宇峰); LIU Donghui (刘东辉) (The 54th Research Institute of China Electronics Technology Group Corporation (CETC54), Shijiazhuang, Hebei 050081, China; Hebei Key Laboratory of Electromagnetic Spectrum Cognition and Control, Shijiazhuang, Hebei 050081, China; School of Economics and Management, Shijiazhuang Tiedao University, Shijiazhuang, Hebei 050043, China)
Source Hebei Journal of Industrial Science and Technology (《河北工业科技》), CAS, 2022, No. 4, pp. 328-334 (7 pages)
Funding China Postdoctoral Science Foundation (2021M693002).
Keywords artificial intelligence; multi-agent system; deep reinforcement learning; multi-agent deep deterministic policy gradient; asynchronous parallel framework; shared experience pool; phased learning
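The abstract names three ingredients: asynchronous parallel experience collection, a shared experience pool, and phased training that moves from an expert-based guiding reward to a result-based descriptive reward. The sketch below illustrates how these pieces could fit together. It is not the authors' implementation: the class and function names, the reward formulas, the fixed `switch_episode` threshold, and the random placeholder environment are all assumptions made for illustration, and the DDPG actor and critic updates are omitted.

```python
import random
import threading
from collections import deque


class SharedExperiencePool:
    """Thread-safe replay buffer that several collector threads write into."""

    def __init__(self, capacity):
        self._buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self._lock = threading.Lock()

    def add(self, transition):
        with self._lock:
            self._buffer.append(transition)

    def sample(self, batch_size):
        with self._lock:
            items = list(self._buffer)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self):
        with self._lock:
            return len(self._buffer)


def guiding_reward(action, expert_action):
    """Hypothetical guiding reward: negative squared distance to the expert action."""
    return -sum((a - e) ** 2 for a, e in zip(action, expert_action))


def descriptive_reward(mission_succeeded):
    """Hypothetical descriptive reward: depends only on the task outcome."""
    return 1.0 if mission_succeeded else -1.0


def phased_reward(episode, switch_episode, action, expert_action, mission_succeeded):
    """Phase 1 imitates the expert; phase 2 optimises the task result.

    Switching at a fixed episode index is an assumption of this sketch.
    """
    if episode < switch_episode:
        return guiding_reward(action, expert_action)
    return descriptive_reward(mission_succeeded)


def worker(pool, worker_id, n_steps, episode, switch_episode):
    """One asynchronous collector; random numbers stand in for a real environment."""
    rng = random.Random(worker_id)
    for _ in range(n_steps):
        state = (rng.random(), rng.random())
        action = (rng.uniform(-1.0, 1.0),)
        expert_action = (0.0,)               # placeholder expert rule
        succeeded = rng.random() < 0.5       # placeholder task outcome
        reward = phased_reward(episode, switch_episode, action, expert_action, succeeded)
        pool.add((state, action, reward))


def collect(n_workers=4, n_steps=50, capacity=1000, episode=0, switch_episode=100):
    """Run several collectors in parallel against one shared pool."""
    pool = SharedExperiencePool(capacity)
    threads = [
        threading.Thread(target=worker, args=(pool, i, n_steps, episode, switch_episode))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool
```

In this sketch a learner would call `pool.sample(batch_size)` to draw minibatches for the DDPG update while the collectors keep running; the lock only guards buffer access, so collection and learning do not wait on each other beyond that.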