Task allocation of unmanned aerial vehicle for rural last-mile delivery based on reinforcement learning

Original title (Chinese): 基于强化学习的无人机乡村末端配送任务分配

Abstract: Last-mile delivery in rural areas is difficult, slow, and costly, which makes efficient and accurate last-mile delivery scheduling particularly important. To address task allocation for multiple logistics Unmanned Aerial Vehicles (UAVs) in rural delivery scenarios, a multi-objective UAV task allocation model was established that accounts for both UAV payload capacity and maximum flight distance, with the objectives of minimizing UAV flight distance, minimizing the number of UAVs dispatched, and avoiding time-window violations. Firstly, building on reinforcement learning, an encoder and an attention mechanism were introduced to simplify the state space effectively and thus address the high dimensionality of task allocation. Secondly, a global-local search strategy was incorporated to explore the solution space while avoiding local optima, thereby improving solution quality. Finally, the weight settings were analyzed further, and the optimal combination of weight coefficients for the sub-objective functions was obtained experimentally. Simulation results show that, in terms of final path length, the proposed algorithm SG-HQM (Sine and Gaussian HQM) achieved reductions of 8.35%, 9.88%, 10.29%, and 12.48% compared with the Hybrid Q-learning network based Method (HQM), the Adaptive Large Neighborhood Search (ALNS) algorithm, the Q-learning algorithm, and the Genetic Algorithm (GA), respectively.
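The abstract describes a weighted multi-objective model (flight distance, number of UAVs dispatched, time-window violations) under payload and flight-range constraints, but it does not give the formulas. The following is only a minimal sketch of how such a weighted-sum evaluation could look; it is not the authors' model. The names `Task` and `route_cost`, the depot location, the constant speed, the default limits, and the weights `w_dist`, `w_uav`, `w_tw` are all hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    """A delivery task (hypothetical structure, not from the paper)."""
    x: float
    y: float
    demand: float    # parcel weight
    earliest: float  # time window opens
    latest: float    # time window closes


def route_cost(routes, tasks, depot=(0.0, 0.0), speed=1.0,
               max_payload=5.0, max_range=30.0,
               w_dist=1.0, w_uav=1.0, w_tw=1.0):
    """Weighted sum of total flight distance, UAVs used, and lateness.

    `routes` holds one list of task indices per UAV. Returns math.inf
    when a route exceeds the assumed payload or flight-range limit.
    """
    total_dist = 0.0
    lateness = 0.0
    uavs_used = 0
    for route in routes:
        if not route:
            continue  # this UAV is not dispatched
        uavs_used += 1
        if sum(tasks[i].demand for i in route) > max_payload:
            return math.inf  # payload constraint violated
        pos, t, dist = depot, 0.0, 0.0
        for i in route:
            task = tasks[i]
            leg = math.dist(pos, (task.x, task.y))
            dist += leg
            # Arrive after flying the leg; wait if the window is not open yet.
            t = max(t + leg / speed, task.earliest)
            lateness += max(0.0, t - task.latest)
            pos = (task.x, task.y)
        dist += math.dist(pos, depot)  # return flight to the depot
        if dist > max_range:
            return math.inf  # flight-range constraint violated
        total_dist += dist
    return w_dist * total_dist + w_uav * uavs_used + w_tw * lateness
```

In this sketch the hard constraints (payload, range) make an assignment infeasible, while time-window lateness is treated as a soft penalty; the paper may handle these differently. A search procedure such as the reinforcement learning approach named in the abstract would compare candidate assignments by a cost of this kind.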

Authors: CHEN Xiaojuan; ZHANG Wei (College of Information and Communication Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001, China)

Affiliation: College of Information and Communication Engineering, Harbin Engineering University

Source: Journal of Computer Applications (《计算机应用》), Peking University Core Journal, 2025, No. 12, pp. 4055-4063 (9 pages)

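Funding: Supported by the State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System (CEMEE2021K0101A).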

Keywords: last-mile delivery; task allocation; reinforcement learning; Unmanned Aerial Vehicle (UAV); multi-objective optimization

Classification number: TP302 [Automation and Computer Technology: Computer System Architecture]