Abstract
Target strike allocation assigns limited weapon resources to a set of threat targets so as to maximize overall combat effectiveness or minimize combat losses. Traditional static evaluation methods adapt poorly to the dynamic evolution of the two sides' strategies, so their allocation schemes quickly become ineffective in strongly adversarial scenarios. To address this problem, a target strike allocation method based on self-play adversarial learning (SPAL) is proposed. The method uses a dual Actor-Critic network to integrate the target strike allocation strategy into a reinforcement learning framework. Through alternating confrontation between the "offensive" and "defensive" sides and iterative model upgrades, the allocation strategy is learned online from adversarial task scenarios. Experimental results show that, compared with rule-based methods and methods without self-play adversarial learning, SPAL performs better on metrics such as task completion rate and friendly-force survival rate.
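For orientation, the allocation problem described in the abstract is often posed as a classical weapon-target assignment: choose one target per weapon so that the expected surviving threat value is as small as possible. The sketch below illustrates only that static objective on a tiny example; it is not the SPAL method, and the kill probabilities, target values, and function names are illustrative assumptions that do not come from the paper.

```python
from itertools import product


def expected_surviving_value(assignment, kill_prob, target_value):
    """Expected threat value left after all weapons fire.

    assignment[w] is the index of the target that weapon w engages.
    kill_prob[w][t] is the (assumed) probability that weapon w destroys target t.
    target_value[t] is the (assumed) threat value of target t.
    """
    survive_prob = [1.0] * len(target_value)
    for weapon, target in enumerate(assignment):
        # Independent shots: survival probabilities multiply.
        survive_prob[target] *= 1.0 - kill_prob[weapon][target]
    return sum(v * s for v, s in zip(target_value, survive_prob))


def best_assignment(kill_prob, target_value):
    """Exhaustively search all weapon-to-target assignments (small cases only)."""
    n_weapons = len(kill_prob)
    n_targets = len(target_value)
    return min(
        product(range(n_targets), repeat=n_weapons),
        key=lambda a: expected_surviving_value(a, kill_prob, target_value),
    )


if __name__ == "__main__":
    # Hypothetical numbers: 2 weapons, 2 targets.
    kill_prob = [[0.8, 0.3], [0.5, 0.6]]
    target_value = [10.0, 5.0]
    plan = best_assignment(kill_prob, target_value)
    print(plan, expected_surviving_value(plan, kill_prob, target_value))
```

A fixed table such as `kill_prob` is exactly what a static evaluation assumes. In an adversarial setting the defender's responses would change these numbers between engagements, which is the gap the abstract says self-play training is meant to address.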
Author
刘晓鹏
LIU Xiaopeng (No. 95561 Unit, the PLA, Lhasa 850000, China)
Source
Journal of Air & Space Early Warning Research (《空天预警研究学报》)
2026, No. 1, pp. 62-68 (7 pages)
Keywords
target strike allocation
self-play
adversarial learning
deep reinforcement learning