Cooperative Multi-Agent Reinforcement Learning Methods Under Communication Constraints

Abstract: To address the problem that most existing cooperative multi-agent reinforcement learning methods adopt overly idealized communication assumptions, a cooperative multi-agent reinforcement learning method under communication constraints is proposed. First, a more realistic communication-constrained environment is constructed by introducing random information loss and additive Gaussian white noise. Second, a value decomposition method based on residual connections is proposed, which uses the residual structure to make the system more robust to fluctuations in communication quality and to observation noise. Finally, the proposed method is validated in a communication-constrained test environment built on the StarCraft Multi-Agent Challenge benchmark. Experimental results show that the proposed method performs well across a variety of communication-constrained scenarios and significantly outperforms current mainstream multi-agent reinforcement learning methods.
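The abstract does not give the details of the communication-constrained environment, so the following is only a minimal sketch of the general idea of combining random message loss with additive Gaussian white noise. The function name `degrade_message`, the parameters `drop_prob` and `noise_std`, and the choice to replace a lost message with zeros are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Optional


def degrade_message(
    message: List[float],
    drop_prob: float,
    noise_std: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Simulate one transmission over an unreliable channel (hypothetical helper).

    With probability `drop_prob` the whole message is lost and replaced by
    zeros (an assumed convention). Otherwise, zero-mean Gaussian noise with
    standard deviation `noise_std` is added to every component.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must lie in [0, 1]")
    if noise_std < 0.0:
        raise ValueError("noise_std must be non-negative")
    if rng is None:
        rng = random.Random()

    # Random information loss: the receiver gets an all-zero vector.
    if rng.random() < drop_prob:
        return [0.0] * len(message)

    # Additive Gaussian white noise: independent noise on each component.
    return [x + rng.gauss(0.0, noise_std) for x in message]
```

Under these assumptions, a call such as `degrade_message(obs, drop_prob=0.3, noise_std=0.1)` could be applied to each inter-agent message before the receiving agent uses it, and varying the two parameters would give scenarios of different severity.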

Authors: HU Xiaoliang; LIN Yuting; GUO Pengcheng; HUANG Shimei; CHEN Yewang (College of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210014, China; College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China)

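Affiliations: College of Computer Science and Engineering, Nanjing University of Science and Technology; College of Computer Science and Technology, Huaqiao University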

Source: Journal of Huaqiao University (Natural Science), 2026, No. 2, pp. 193-201 (9 pages)

Funding: Supported by the Industry-University Fund Project of Xiamen, Fujian Province (2024CXY0237).

Keywords: communication constraint; cooperative multi-agent reinforcement learning; residual connection; value decomposition

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

