Journal Articles
6 articles found
1. Improved Double Deep Q Network Algorithm Based on Average Q-Value Estimation and Reward Redistribution for Robot Path Planning
Authors: Yameng Yin, Lieping Zhang, Xiaoxu Shi, Yilin Wang, Jiansheng Peng, Jianchu Zou. Computers, Materials & Continua (SCIE, EI), 2024, No. 11, pp. 2769-2790 (22 pages)
By integrating deep neural networks with reinforcement learning, the Double Deep Q Network (DDQN) algorithm overcomes the limitations of Q-learning in handling continuous spaces and is widely applied in the path planning of mobile robots. However, the traditional DDQN algorithm suffers from sparse rewards and inefficient utilization of high-quality data. Targeting these problems, an improved DDQN algorithm based on average Q-value estimation and reward redistribution is proposed. First, to enhance the precision of the target Q-value, the average of multiple previously learned Q-values from the target Q network is used to replace the single Q-value from the current target Q network. Next, a reward redistribution mechanism is designed to overcome the sparse reward problem by adjusting the final reward of each action using the round reward from trajectory information. Additionally, a reward-prioritized experience selection method is introduced, which ranks experience samples according to reward values to ensure frequent utilization of high-quality data. Finally, simulation experiments are conducted to verify the effectiveness of the proposed algorithm in a fixed-position scenario and in random environments. The experimental results show that, compared to the traditional DDQN algorithm, the proposed algorithm achieves shorter average running time, higher average return, and fewer average steps. The performance of the proposed algorithm is improved by 11.43% in the fixed scenario and 8.33% in random environments. It not only plans economical and safe paths but also significantly improves efficiency and generalization in path planning, making it suitable for widespread application in autonomous navigation and industrial automation.
Keywords: double deep Q network; path planning; average Q-value estimation; reward redistribution mechanism; reward-prioritized experience selection method
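To make the first improvement concrete, below is a minimal PyTorch sketch of a DDQN target that averages over several stored snapshots of the target network rather than querying only the current one; the network handles, snapshot list, and tensor shapes are illustrative assumptions, not the paper's code.

```python
import torch

# Minimal sketch of the averaged target idea, assuming `target_snapshots`
# holds the K most recent copies of the target Q network and `online_net`
# is the current online network. All names here are assumptions.
def averaged_ddqn_target(online_net, target_snapshots,
                         next_states, rewards, dones, gamma=0.99):
    with torch.no_grad():
        # DDQN: the online network selects the greedy next action ...
        greedy = online_net(next_states).argmax(dim=1, keepdim=True)
        # ... and its value is averaged over several past target networks
        # instead of being taken from the single current target network.
        q_vals = torch.stack([
            net(next_states).gather(1, greedy).squeeze(1)
            for net in target_snapshots
        ])
        avg_q = q_vals.mean(dim=0)
        return rewards + gamma * (1.0 - dones) * avg_q
```

Averaging over K snapshots damps the estimation noise of any single target network, which is the precision gain the abstract refers to.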
2. Improved Double Deep Q Network-Based Task Scheduling Algorithm in Edge Computing for Makespan Optimization (Cited by: 3)
Authors: Lei Zeng, Qi Liu, Shigen Shen, Xiaodong Liu. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 806-817 (12 pages)
Edge computing nodes undertake an increasing number of tasks as business density rises. Therefore, how to efficiently allocate large-scale and dynamic workloads to edge computing resources has become a critical challenge. This study proposes an edge task scheduling approach based on an improved Double Deep Q Network (DQN), in which the calculation of target Q-values and the selection of actions are separated into two networks. A new reward function is designed, and a control unit is added to the experience replay unit of the agent. The management of experience data is also modified to fully utilize its value and improve learning efficiency. Reinforcement learning agents usually learn from an ignorant state, which is inefficient. As such, this study proposes a novel particle swarm optimization algorithm with an improved fitness function, which can generate optimal solutions for task scheduling. These optimized solutions are provided to the agent to pre-train the network parameters and obtain a better cognition level. The proposed algorithm is compared with six other methods in simulation experiments. Results show that the proposed algorithm outperforms the other benchmark methods regarding makespan.
Keywords: edge computing; task scheduling; reinforcement learning; makespan; double deep Q network (DQN)
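As a rough illustration of the pre-training idea, here is a hypothetical NumPy sketch of particle swarm optimization producing a seed task-to-node assignment; the `fitness` callable (mapping an assignment to its makespan), the encoding, and all hyperparameters are assumptions, not the paper's algorithm or its improved fitness function.

```python
import numpy as np

# Hypothetical PSO over a continuous encoding of task->node assignments.
# `fitness(assignment)` is assumed to return the schedule's makespan.
def pso_schedule(fitness, n_tasks, n_nodes, swarm=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0, n_nodes, (swarm, n_tasks))  # continuous positions
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fitness(p.astype(int)) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0, n_nodes - 1e-9)
        f = np.array([fitness(p.astype(int)) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    # Best assignment found; such solutions would seed the agent's pre-training.
    return gbest.astype(int)
```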
3. An Optimization Method for Engineering Project Document Management Based on Document Workflow and Reinforcement Learning
Authors: Si Pengbo (司鹏搏), Pang Rui (庞睿), Yang Ruizhe (杨睿哲), Sun Yanhua (孙艳华), Li Meng (李萌). Journal of Beijing University of Technology (北京工业大学学报), Peking University Core, 2025, No. 10, pp. 1162-1170 (9 pages)
To address the transmission time and cost of documents in large-scale engineering projects, an optimization method for engineering project document management based on document workflow is proposed. First, an engineering project document management environment and a document workflow model with logical ordering are constructed, and document transmission and caching are analyzed. On this basis, the document management optimization problem is modeled as a Markov process, and the task completion time of the document workflow and the caching cost are jointly optimized by designing the state space, action space, and reward function. Second, a dueling double deep Q network (D3QN) is adopted to reduce training time and improve training efficiency. Simulation results verify the effectiveness of the proposed scheme for document transmission under different parameter configurations, and show that it maintains good optimization capability as the task volume grows.
Keywords: document workflow; transmission time; Markov process; dueling double deep Q network (D3QN); document management; joint optimization
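For reference, the dueling architecture that D3QN combines with double Q-learning can be sketched in a few lines of PyTorch; layer sizes and names below are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

# Minimal dueling head: the state value V(s) and per-action advantages
# A(s, a) are estimated by separate streams and recombined into Q(s, a).
class DuelingQNet(nn.Module):
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, s):
        h = self.feature(s)
        v, a = self.value(h), self.advantage(h)
        # Subtract the mean advantage so V and A are identifiable.
        return v + a - a.mean(dim=1, keepdim=True)
```

D3QN then applies the double-DQN target (online net selects, target net evaluates) on top of this head, which is what reduces overestimation and speeds up training.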
4. Deep reinforcement learning for UAV swarm rendezvous behavior (Cited by: 2)
Authors: ZHANG Yaozhong, LI Yike, WU Zhuoran, XU Jialin. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2023, No. 2, pp. 360-373 (14 pages)
Unmanned aerial vehicle (UAV) swarm technology has been a research hotspot in recent years. With the continuous improvement of the autonomous intelligence of UAVs, swarm technology will become one of the main trends of UAV development in the future. This paper studies the behavior decision-making process of the UAV swarm rendezvous task based on the double deep Q network (DDQN) algorithm. We design a guided reward function to effectively solve the convergence problem caused by sparse returns in deep reinforcement learning (DRL) for long-period tasks. We also propose the concept of a temporary storage area, optimizing the memory replay unit of the traditional DDQN algorithm, improving the convergence speed of the algorithm, and speeding up the training process. Different from traditional task environments, this paper establishes a continuous state-space task environment model to improve the verification process of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm in different task scenarios are trained. The experimental results validate that the DDQN algorithm is efficient at training the UAV swarm to complete the given collaborative tasks while meeting the swarm's requirements for centralization and autonomy, improving the intelligence of UAV swarm collaborative task execution. The simulation results show that, after training, the proposed UAV swarm can carry out the rendezvous task well, with a mission success rate of 90%.
Keywords: double deep Q network (DDQN) algorithm; unmanned aerial vehicle (UAV) swarm; task decision; deep reinforcement learning (DRL); sparse returns
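The abstract does not detail how the temporary storage area works; one plausible reading is a staging buffer in front of the main replay memory, as in the sketch below. The class, its methods, and the commit-at-episode-end policy are assumptions for illustration, not the paper's design.

```python
import random
from collections import deque

# Hypothetical staging buffer: transitions from the ongoing episode are
# held in a temporary area and committed to the main replay memory only
# when the episode completes.
class StagedReplayBuffer:
    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)
        self.staging = []  # temporary storage for the current episode

    def stage(self, transition):
        self.staging.append(transition)

    def commit_episode(self):
        # Move the finished episode into the main replay memory.
        self.memory.extend(self.staging)
        self.staging.clear()

    def sample(self, batch_size):
        return random.sample(self.memory, batch_size)
```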
5. Safe Navigation for UAV-Enabled Data Dissemination by Deep Reinforcement Learning in Unknown Environments (Cited by: 1)
Authors: Fei Huang, Guangxia Li, Shiwei Tian, Jin Chen, Guangteng Fan, Jinghui Chang. China Communications (SCIE, CSCD), 2022, No. 1, pp. 202-217 (16 pages)
Unmanned aerial vehicles (UAVs) are increasingly considered in safe autonomous navigation systems that explore unknown environments, where the UAVs are equipped with multiple sensors to perceive their surroundings. However, how to achieve UAV-enabled data dissemination while simultaneously ensuring safe navigation is a new challenge. In this paper, our goal is to minimize the weighted sum of the UAV's task completion time while satisfying the data transmission task requirement and the UAV's feasible flight region constraints. This problem cannot be solved via standard optimization methods, mainly because a tractable and accurate system model is lacking in practice. To overcome this issue, we propose a new solution approach based on the dueling double deep Q network (dueling DDQN) with multi-step learning. Specifically, to improve the algorithm, extra labels are added to the primitive states. Simulation results indicate the validity and performance superiority of the proposed algorithm under different data thresholds compared with two other benchmarks.
Keywords: unmanned aerial vehicles (UAVs); safe autonomous navigation; unknown environments; data dissemination; dueling double deep Q network (dueling DDQN)
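The multi-step learning component can be illustrated by the standard n-step return used as the bootstrap target; the sketch below is a generic version under assumed inputs (per-step rewards and done flags plus a bootstrap Q-value from the dueling DDQN target network), not the paper's implementation.

```python
# Generic n-step return: accumulate discounted rewards over the next n
# steps, then bootstrap from the target network's Q-value at step n.
def n_step_target(rewards, dones, bootstrap_q, gamma=0.99):
    """rewards/dones: sequences for the next n steps;
    bootstrap_q: Q-value estimate at the n-th state."""
    target, discount = 0.0, 1.0
    for r, d in zip(rewards, dones):
        target += discount * r
        if d:  # episode terminated before reaching step n
            return target
        discount *= gamma
    return target + discount * bootstrap_q
```

Propagating reward over n steps per update is what lets sparse navigation rewards reach earlier states faster than one-step DDQN.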
6. Tactical conflict resolution in urban airspace for unmanned aerial vehicles operations using attention-based deep reinforcement learning (Cited by: 1)
Authors: Mingcheng Zhang, Chao Yan, Wei Dai, Xiaojia Xiang, Kin Huat Low. Green Energy and Intelligent Transportation, 2023, No. 4, pp. 43-57 (15 pages)
Unmanned aerial vehicles (UAVs) have gained much attention from academia and industry due to the significant number of potential applications in urban airspace. A traffic management system is needed to manage this future traffic. Tactical conflict resolution for unmanned aerial systems (UASs) is an essential piece of the puzzle for future UAS Traffic Management (UTM), especially in very low-level (VLL) urban airspace. Unlike conflict resolution in higher-altitude airspace, dense high-rise buildings are an essential source of potential conflict to be considered in VLL urban airspace. In this paper, we propose an attention-based deep reinforcement learning approach to the tactical conflict resolution problem. Specifically, we formulate the task as a sequential decision-making problem using a Markov Decision Process (MDP). The double deep Q network (DDQN) framework is used as the learning framework in which the host drone learns to output conflict-free maneuvers at each time step. We use an attention mechanism to model each individual neighbor's effect on the host drone, allowing the learned conflict resolution policy to adapt to an arbitrary number of neighboring drones. Lastly, we build a simulation environment with various scenarios covering different types of encounters to evaluate the proposed approach. The simulation results demonstrate that our algorithm provides a reliable solution that minimizes secondary conflict counts compared with learning and non-learning-based approaches under different traffic density scenarios.
Keywords: unmanned aircraft system traffic management; tactical conflict resolution; double deep Q network; attention mechanism; secondary conflict
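A minimal sketch of attention over a variable number of neighbors follows: each neighbor's embedding is weighted by its learned relevance to the host drone and pooled into a fixed-size summary, which is what lets the Q-network handle any neighbor count. Dimensions, layer shapes, and names are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Scaled dot-product attention from the host drone's state (query) over
# its neighbors' states (keys/values), pooled to a fixed-size vector.
class NeighborAttention(nn.Module):
    def __init__(self, host_dim, nbr_dim, embed=64):
        super().__init__()
        self.query = nn.Linear(host_dim, embed)
        self.key = nn.Linear(nbr_dim, embed)
        self.value = nn.Linear(nbr_dim, embed)

    def forward(self, host, neighbors):
        # host: (B, host_dim); neighbors: (B, N, nbr_dim),
        # where N may differ between batches.
        q = self.query(host).unsqueeze(1)                      # (B, 1, E)
        k, v = self.key(neighbors), self.value(neighbors)      # (B, N, E)
        scores = (q @ k.transpose(1, 2)) / k.shape[-1] ** 0.5  # (B, 1, N)
        weights = scores.softmax(dim=-1)
        return (weights @ v).squeeze(1)  # (B, E) fixed-size neighbor summary
```

The pooled summary would then be concatenated with the host state and fed to the DDQN head, so the policy's input size stays constant regardless of traffic density.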