Reinforcement Learning-based Multi-period Game Theoretic Model for Resource Allocation
Abstract To address resource allocation in attacker-defender games with limited resources, a multi-period game theoretic model based on reinforcement learning is proposed. Over multiple periods, the defender decides how to allocate resources between deploying false targets and strengthening the protection of the genuine target. Multiple attackers, in turn, cooperate to allocate resources between identifying false targets and attacking the genuine target. In each period, the expected utility delivered by the genuine target serves as the reward criterion. A resource allocation model built on Q-learning, a reinforcement learning algorithm, generates the best resource allocation strategies for both sides over the entire planning horizon. An illustrative example demonstrates the effectiveness of the proposed model and algorithm, which can support decision making in multi-period attacker-defender resource allocation problems.
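The abstract gives no equations or parameters, so the following is only a minimal sketch of how tabular Q-learning can plan a budget over several periods, not the authors' model. It assumes a single defender facing a fixed attack effort, with no false targets and no cooperating attackers; the ratio-form payoff `survival_reward`, the budget of 4 units, the 2 periods, and all hyperparameters are hypothetical values chosen for illustration.

```python
import random


def survival_reward(defense_units, attack_units=1.0):
    """Hypothetical per-period payoff (assumption, not from the paper):
    a ratio-form contest function with diminishing returns."""
    total = defense_units + attack_units
    return defense_units / total if total > 0 else 0.0


def train_q_table(budget=4, periods=2, episodes=20000,
                  alpha=0.5, gamma=1.0, epsilon=0.3, seed=0):
    """Tabular Q-learning over states (period, remaining budget).

    The action is the number of resource units spent in the current period.
    """
    rng = random.Random(seed)
    q = {}  # (period, remaining, action) -> estimated value
    for _ in range(episodes):
        remaining = budget
        for t in range(periods):
            actions = list(range(remaining + 1))
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions,
                             key=lambda a: q.get((t, remaining, a), 0.0))
            reward = survival_reward(action)
            next_remaining = remaining - action
            # Value of the best follow-up action; zero after the last period.
            if t + 1 < periods:
                future = max(q.get((t + 1, next_remaining, b), 0.0)
                             for b in range(next_remaining + 1))
            else:
                future = 0.0
            old = q.get((t, remaining, action), 0.0)
            # Standard Q-learning update.
            q[(t, remaining, action)] = old + alpha * (
                reward + gamma * future - old)
            remaining = next_remaining
    return q


def greedy_plan(q, budget=4, periods=2):
    """Read the learned allocation off the Q-table, period by period."""
    plan, remaining = [], budget
    for t in range(periods):
        action = max(range(remaining + 1),
                     key=lambda a: q.get((t, remaining, a), 0.0))
        plan.append(action)
        remaining -= action
    return plan


if __name__ == "__main__":
    print(greedy_plan(train_q_table()))
```

Because the assumed payoff has diminishing returns, the learned plan in this toy setting should split the budget evenly across the two periods. A two-sided version in the spirit of the abstract would need a separate learner for the attackers and a payoff that accounts for false targets, neither of which is specified in this record.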
Authors ZHANG Xiaoxiong; DING Song; PENG Rui; WU Guohua; LIU Zhong (The Sixty-third Research Institute, National University of Defense Technology, Nanjing 210007, China; Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China; School of Economics, Zhejiang University of Finance & Economics, Hangzhou 310018, China; School of Economics & Management, Beijing University of Technology, Beijing 100124, China; School of Traffic & Transportation Engineering, Central South University, Changsha 410075, China)
Source Journal of Tongji University: Natural Science (PKU Core Journal), 2025, No. 6, pp. 985-992 (8 pages)
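Funding National Natural Science Foundation of China (72471236); Beijing Nova Program (Z191100001119100); Young Elite Scientists Sponsorship Program (special fields) of the China Association for Science and Technology (2021-JCJQ-QT-050).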
Keywords resource allocation; attacker-defender game; false targets; reinforcement learning; Q-learning