Abstract
With the rapid development of the O2O (Online-to-Offline) food delivery industry, dynamic order assignment and route optimization have become core challenges in improving delivery efficiency. To address the characteristics of on-demand food delivery, including dynamically arriving orders, uncertain meal preparation times, and cross-type interactions between couriers and orders, this study proposes a model that combines a Heterogeneous Graph Representation Learning (HGRL) module with a Dueling Double Deep Q-Network with Prioritized Experience Replay (D3QN-PER) algorithm. First, the food delivery system is modeled as a heterogeneous graph, and a heterogeneous graph attention network captures the interactions between courier nodes and order nodes. A path-based Markov Decision Process (MDP) is then constructed to characterize the dynamic decision-making setting. Comparative experiments show that the D3QN-PER algorithm achieves a higher average customer service level, shorter average delivery distance and delivery time, and a lower order delay rate, along with better convergence, training stability, and generalization.
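For readers unfamiliar with the three components named in D3QN-PER, the following is a minimal NumPy sketch of their textbook forms: the dueling aggregation of state value and advantages, the double-DQN target, and proportional prioritized replay. It is not the authors' implementation. The abstract does not give the network architecture, reward design, or hyperparameters, so the function names and the values of `gamma`, `alpha`, `beta`, and `eps` here are illustrative assumptions.

```python
import numpy as np

def dueling_q(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    value has shape (batch, 1); advantage has shape (batch, n_actions).
    """
    return value + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN: the online network selects the next action,
    the target network evaluates it. gamma is an assumed discount factor."""
    best_action = q_next_online.argmax(axis=1)
    next_q = q_next_target[np.arange(len(reward)), best_action]
    return reward + gamma * (1.0 - done) * next_q

def per_probabilities(td_error, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) is proportional to (|delta_i| + eps)^alpha."""
    priority = (np.abs(td_error) + eps) ** alpha
    return priority / priority.sum()

def per_weights(prob, beta=0.4):
    """Importance-sampling weights (N * P(i))^(-beta), scaled so the largest is 1."""
    weight = (len(prob) * prob) ** (-beta)
    return weight / weight.max()
```

In a training loop, transitions would be sampled from the replay buffer according to `per_probabilities`, and each sample's squared TD error would be scaled by its `per_weights` entry before the gradient step.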
Authors
ZHANG Wenqiang (张文强)
HUANG Yongsheng (黄永生)
North China University of Science and Technology, Tangshan, Hebei 063210, China
Source
Logistics Technology (《物流技术》)
2026, No. 1, pp. 57-67 (11 pages)
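Funding
Key Social Science Project of the Hebei Provincial Department of Education, "Research on Resource Management and Development Planning of Agricultural Product Logistics in the Beijing-Tianjin-Hebei Region" (SD191014)
Key R&D Program of the Hebei Provincial Department of Science and Technology, "Research on Key Technologies of an Intelligent Roof Inspection and Maintenance Robot for Metro Vehicles" (23311804D)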
Keywords
food delivery
deep learning
reinforcement learning
heterogeneous graph
Markov decision process
order assignment
route planning