Abstract
When an unmanned aerial vehicle (UAV) collects data in emergency scenarios, its limited battery capacity, limited cache space, and the dynamically changing priorities of ground targets lead to poor trajectory planning and resource allocation. To address this problem, a joint optimization method for UAV trajectory planning and resource allocation based on deep reinforcement learning is proposed. First, a mathematical model is built that accounts for the communication, computation, flight, and data caching processes of a UAV mission. Then, a Markov process model is established for the trajectory planning and resource allocation problem, with corresponding state and action descriptions and a weighted reward function that balances UAV energy consumption against the amount of data collected. Finally, the proposed method is compared in simulation with intelligent optimization methods such as greedy and genetic algorithms. The results show that the proposed method considerably increases the amount of data collected from ground users within a shorter mission time, at a similar or lower energy cost.
Authors
雷耀麟
丁文锐
罗祎喆
王玉峰
刘思琪
张芷兰
LEI Yaolin; DING Wenrui; LUO Yizhe; WANG Yufeng; LIU Siqi; ZHANG Zhilan (School of Electronics and Information Engineering, Beihang University, Beijing 100191, China; China Electronics Technology Group Corporation 54th Research Institute, Shijiazhuang 050081, China; Institute of Unmanned System, Beihang University, Beijing 100191, China; School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450053, China; State Grid Xinxiang Electric Power Supply Company, Xinxiang 453000, China)
Source
《北京航空航天大学学报》
Peking University Core Journals index (北大核心)
2025, No. 10, pp. 3460-3470 (11 pages)
Journal of Beijing University of Aeronautics and Astronautics
Funding
National Natural Science Foundation of China, Joint Fund for Enterprise Innovation and Development (U20B2042).
Keywords
unmanned aerial vehicle
resource allocation
trajectory planning
reinforcement learning
mobile edge computing