Online three-dimensional (3D) path planning in dynamic environments is a fundamental problem for achieving autonomous navigation of unmanned aerial vehicles (UAVs). However, existing methods struggle to model traversable dynamic gaps, resulting in conservative and suboptimal trajectories. To address these challenges, this paper proposes a hierarchical reinforcement learning (RL) framework that integrates global path guidance, local trajectory generation, predictive safety evaluation, and neural network-based decision-making. Specifically, the global planner provides long-term navigation guidance, and the local module then uses an improved 3D dynamic window approach (DWA) to generate dynamically feasible candidate trajectories. To enhance safety in dense dynamic scenarios, the algorithm introduces a predictive axis-aligned bounding box (AABB) strategy to model the future occupancy of obstacles, combined with convex hull verification for efficient trajectory safety assessment. Furthermore, a double deep Q-network (DDQN) is employed with structured feature encoding, enabling the neural network to reliably select the optimal trajectory from the candidate set, thereby improving robustness and generalization. Comparative experiments conducted in a high-fidelity simulation environment show that the proposed algorithm outperforms existing algorithms, reducing the average number of collisions to 0.2 while shortening the average task completion time by approximately 15% and achieving a success rate of 97%.
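The predictive AABB idea can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything below is an assumption made for illustration: obstacles are taken to move at constant velocity over a short horizon, the names `predictive_aabb`, `segment_hits_aabb`, and `trajectory_is_safe` are hypothetical, and a simple segment-versus-box slab test stands in for the convex hull verification, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AABB:
    lo: Vec3  # minimum corner (x, y, z)
    hi: Vec3  # maximum corner (x, y, z)


def predictive_aabb(center: Vec3, half: Vec3, vel: Vec3,
                    horizon: float, margin: float = 0.0) -> AABB:
    """Box enclosing an obstacle's occupancy over [0, horizon].

    Assumption (not from the source): constant velocity over the horizon.
    """
    lo = tuple(min(c, c + v * horizon) - h - margin
               for c, v, h in zip(center, vel, half))
    hi = tuple(max(c, c + v * horizon) + h + margin
               for c, v, h in zip(center, vel, half))
    return AABB(lo, hi)


def segment_hits_aabb(p0: Vec3, p1: Vec3, box: AABB) -> bool:
    """Slab test: does the segment p0 -> p1 intersect the box?"""
    t_enter, t_exit = 0.0, 1.0
    for a, b, lo, hi in zip(p0, p1, box.lo, box.hi):
        d = b - a
        if abs(d) < 1e-12:
            # Segment is parallel to this slab: reject if it lies outside.
            if a < lo or a > hi:
                return False
            continue
        t0, t1 = (lo - a) / d, (hi - a) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True


def trajectory_is_safe(waypoints: Sequence[Vec3],
                       boxes: Sequence[AABB]) -> bool:
    """A candidate is safe if none of its segments touches any predicted box."""
    return not any(segment_hits_aabb(p, q, box)
                   for p, q in zip(waypoints, waypoints[1:])
                   for box in boxes)
```

Because the box covers the whole swept volume, this check is conservative: a candidate may be rejected even if it would pass through the region before the obstacle arrives. A tighter, time-aware verification step would be needed to recover such gaps.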
Funding: Supported by the Postgraduate Research & Practice Innovation Program of Nanjing University of Aeronautics and Astronautics (NUAA) (No. xcxjh20251502).
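The DDQN selection step named in the abstract can be sketched in a few lines. This is the standard double DQN target (the online network picks the next action, the target network scores it), not the paper's network or feature encoding, which the abstract does not specify. The function names, the `gamma` default, and the safety mask in `select_trajectory` are hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def double_dqn_target(reward: float, done: bool,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      gamma: float = 0.99) -> float:
    """Double DQN bootstrap target for one transition.

    q_online_next / q_target_next hold the next-state Q-values of each
    candidate trajectory under the online and target networks.
    """
    if done:
        return reward
    best = argmax(q_online_next)                 # online network selects
    return reward + gamma * q_target_next[best]  # target network evaluates


def select_trajectory(q_values: Sequence[float],
                      safe_mask: Sequence[bool]) -> Optional[int]:
    """Pick the highest-Q candidate among those marked safe.

    The mask is an assumption of this sketch; returns None if no
    candidate is safe.
    """
    safe = [i for i, ok in enumerate(safe_mask) if ok]
    if not safe:
        return None
    return max(safe, key=q_values.__getitem__)
```

Decoupling selection from evaluation is what distinguishes this target from the vanilla DQN target, which takes the maximum of the target network's own values and tends to overestimate.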