Research on multi-agent hierarchical reinforcement learning algorithm for solving one type of 3D bin packing problem
Abstract To address the multi-bin three-dimensional bin packing problem (3D-BPP) in semi-online scenarios, and to improve both packing decision efficiency and space utilization, this paper proposes a multi-agent hierarchical reinforcement learning algorithm. The problem is modeled as a multi-agent Markov decision process (MAMDP) in which three fully cooperative agents handle item selection, bin selection, and placement planning, respectively. A distributional (value-distribution) learning method is introduced to improve the stability and convergence of the algorithm. Experimental results show that the algorithm performs well across different environment configurations, significantly improving space utilization and the number of items packed, and that it generalizes well to scenarios with multiple bins and multiple candidate items. Compared with traditional heuristic algorithms, it has clear advantages in dynamic decision-making and adaptability, and is notably robust when item sizes follow an unknown distribution. The algorithm applies a multi-agent hierarchical reinforcement learning framework to the 3D-BPP for the first time, achieving end-to-end optimization of packing decisions and providing a novel solution for complex packing scenarios.
Authors CHU Yang; YAN Xue-feng; ZHANG Xuan-ye; XU Yun-wen; LI De-wei (School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing Jiangsu 211106, China; Jiangsu Automation Research Institute, Lianyungang Jiangsu 222061, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing Jiangsu 210023, China; Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China)
Source Control Theory & Applications (控制理论与应用), PKU Core journal, 2025, No. 12, pp. 2569-2576 (8 pages)
Keywords three-dimensional packing problem; deep reinforcement learning; multi-agent reinforcement learning; combinatorial optimization
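
The three-stage decision structure described in the abstract (item selection, then bin selection, then placement planning) can be illustrated with a small sketch. This is not the authors' algorithm: each learned agent is replaced here by a simple greedy rule, and the heightmap bin model, the look-ahead buffer, the lack of rotation or stability checks, and all names (`Bin`, `select_item`, `select_bin`, `select_position`, `pack`) are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Item = Tuple[int, int, int]  # (length, width, height); rotation is ignored in this sketch


@dataclass
class Bin:
    """Hypothetical bin model: a grid of stack heights (no stability check)."""
    length: int
    width: int
    height: int
    heightmap: List[List[int]] = field(default_factory=list)
    used_volume: int = 0

    def __post_init__(self) -> None:
        if not self.heightmap:
            self.heightmap = [[0] * self.width for _ in range(self.length)]

    def rest_height(self, item: Item, x: int, y: int) -> Optional[int]:
        """Height at which the item would rest at (x, y), or None if it does not fit."""
        l, w, h = item
        if x + l > self.length or y + w > self.width:
            return None
        z = max(self.heightmap[i][j] for i in range(x, x + l) for j in range(y, y + w))
        return z if z + h <= self.height else None

    def place(self, item: Item, x: int, y: int, z: int) -> None:
        l, w, h = item
        for i in range(x, x + l):
            for j in range(y, y + w):
                self.heightmap[i][j] = z + h
        self.used_volume += l * w * h

    def utilization(self) -> float:
        return self.used_volume / (self.length * self.width * self.height)


def select_item(buffer: List[Item]) -> int:
    """Stand-in for the item-selection agent: pick the largest-volume item."""
    return max(range(len(buffer)), key=lambda k: buffer[k][0] * buffer[k][1] * buffer[k][2])


def select_position(bin_: Bin, item: Item) -> Optional[Tuple[int, int, int]]:
    """Stand-in for the placement agent: lowest resting height, first found on ties."""
    best = None
    for x in range(bin_.length):
        for y in range(bin_.width):
            z = bin_.rest_height(item, x, y)
            if z is not None and (best is None or z < best[2]):
                best = (x, y, z)
    return best


def select_bin(bins: List[Bin], item: Item) -> Optional[int]:
    """Stand-in for the bin-selection agent: fullest bin that can still take the item."""
    feasible = [b for b in range(len(bins)) if select_position(bins[b], item) is not None]
    if not feasible:
        return None
    return max(feasible, key=lambda b: bins[b].used_volume)


def pack(stream: List[Item], bins: List[Bin], buffer_size: int = 3) -> int:
    """Semi-online loop: only `buffer_size` upcoming items are visible at a time."""
    stream, buffer, packed = list(stream), [], 0
    while stream or buffer:
        while stream and len(buffer) < buffer_size:
            buffer.append(stream.pop(0))
        item = buffer.pop(select_item(buffer))      # stage 1: which item
        b = select_bin(bins, item)                  # stage 2: which bin
        if b is None:
            continue  # assumption: an item that fits nowhere is skipped
        x, y, z = select_position(bins[b], item)    # stage 3: where in the bin
        bins[b].place(item, x, y, z)
        packed += 1
    return packed
```

For example, `pack([(2, 2, 1), (2, 2, 1), (1, 1, 1)], [Bin(2, 2, 2)], buffer_size=2)` packs the two flat items and skips the last one, leaving the single bin fully utilized. In a learned version, each of the three `select_*` functions would be a policy trained toward a shared reward, which is what makes the agents fully cooperative.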