Journal Articles
19 articles found
A Balanced Traffic Signal Scheduling Algorithm Based on Improved Deep Q Networks
1
Author: He Daokun. Machinery Design & Manufacture (机械设计与制造), PKU Core, 2025, No. 4, pp. 135-140 (6 pages)
To further relieve traffic congestion at urban intersections during peak hours and achieve balanced passage of vehicle flows on all roads at an intersection, a balanced traffic signal scheduling algorithm based on improved Deep Q Networks is proposed. The features most relevant to traffic signal scheduling at intersections are extracted; traffic signal models are built for a one-way intersection and for linear two-way intersections, and a traffic signal scheduling optimization model is constructed on that basis. To address the convergence and overestimation weaknesses of the Deep Q Networks algorithm when applied to traffic signal scheduling, Deep Q Networks is improved with a dueling network, a double-network structure and a revised gradient update strategy, and a matching balanced scheduling algorithm is proposed. Simulation comparisons with classic Deep Q Networks verify the applicability and superiority of the proposed algorithm for traffic signal scheduling. Simulations on urban road data for the two scenarios show that the algorithm effectively shortens vehicle queues at intersections, balances traffic throughput across approaches, relieves congestion in peak travel directions, and improves the efficiency of intersection traffic signal scheduling.
Keywords: traffic signal scheduling; intersection; Deep Q Networks; deep reinforcement learning; intelligent transportation
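The dueling and double-network improvements named in this abstract are standard DRL techniques; below is a minimal PyTorch sketch of a dueling Q-network head, assuming illustrative layer sizes and a toy intersection state (the abstract does not give the paper's actual architecture or gradient-update changes).

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling Q-network: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s,a)

    def forward(self, state):
        h = self.feature(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtract the mean advantage so V and A are identifiable
        return v + a - a.mean(dim=1, keepdim=True)

# Illustrative dimensions: e.g. queue lengths of 4 approaches as the state,
# a choice among 4 signal phases as the actions (assumed, not from the paper).
net = DuelingQNet(state_dim=4, n_actions=4)
print(net(torch.randn(1, 4)))
```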
A Traffic Light Control Method Based on Deep Q Networks (Cited by 2)
2
Authors: Yan Wensheng, Lü Hongbing. Computer Measurement & Control (计算机测量与控制), 2021, No. 6, pp. 93-97 (5 pages)
Intelligent control of traffic lights is a hot topic in current intelligent transportation research. To adapt to dynamic traffic more promptly and effectively and further improve vehicle flow efficiency at street intersections, a traffic light control method based on Deep Q Networks is proposed. Starting from a description of the traffic light control problem, a reinforcement learning model of traffic light control is built from the three elements of state, action and reward, and a Deep Q Networks based control workflow is presented. To test the method's effectiveness, comparison and simulation experiments were run in SUMO on traffic data from the intersection of Shifu Avenue and Donghuan Avenue in Taizhou, Zhejiang Province. The results show that the method offers higher efficiency and autonomy in traffic light control and scheduling, improves intersection throughput, and performs better in optimizing vehicle dwell time, queue length and waiting time at intersections.
Keywords: traffic lights; Deep Q Networks; intelligent transportation; signal control
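As a rough illustration of the state-action-reward formulation the abstract mentions, a toy traffic-light MDP step might look like the following; the state variables, discharge rate and queue-based reward are assumptions, not the paper's definitions.

```python
from dataclasses import dataclass
import random

@dataclass
class IntersectionState:
    queue_ns: int  # queued vehicles, north-south approach
    queue_ew: int  # queued vehicles, east-west approach
    phase: int     # 0 = NS green, 1 = EW green

def step(s: IntersectionState, action: int) -> tuple:
    """One control step: action 0 keeps the phase, action 1 switches it.
    Reward is the negative total queue length (an assumed choice)."""
    phase = s.phase if action == 0 else 1 - s.phase
    # The served approach discharges vehicles; both receive random arrivals.
    served = max(0, (s.queue_ns if phase == 0 else s.queue_ew) - 3)
    arr_ns, arr_ew = random.randint(0, 2), random.randint(0, 2)
    ns = (served if phase == 0 else s.queue_ns) + arr_ns
    ew = (served if phase == 1 else s.queue_ew) + arr_ew
    nxt = IntersectionState(ns, ew, phase)
    return nxt, -(ns + ew)

s = IntersectionState(queue_ns=5, queue_ew=2, phase=0)
print(step(s, action=0))
```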
Coordinated Pushing and Grasping Control of a Robotic Arm Based on Deep Q Networks (Cited by 3)
3
Author: He Daokun. Modern Manufacturing Engineering (现代制造工程), CSCD, PKU Core, 2021, No. 7, pp. 23-28 (6 pages)
Given the limited application of robotic arms in complex scenes and the scarcity of research on autonomous coordinated pushing and grasping control, a Deep Q Networks based coordinated pushing and grasping control method is proposed that exploits the rule-free, self-learning strengths of deep Q networks. Two fully convolutional networks map scene information to pushing or grasping actions; through a Markov decision process with a far-sighted reward mechanism, the action with the best value is selected, achieving autonomous coordinated control of pushing and grasping in complex scenes. In simulated and real-world experiments, the method quickly grasped objects in complex scenes through autonomous coordinated pushing and grasping, with higher action efficiency and grasp success rates.
Keywords: robotic arm; grasping; pushing; deep Q networks; coordinated control
Improved Double Deep Q Network Algorithm Based on Average Q-Value Estimation and Reward Redistribution for Robot Path Planning
4
Authors: Yameng Yin, Lieping Zhang, Xiaoxu Shi, Yilin Wang, Jiansheng Peng, Jianchu Zou. Computers, Materials & Continua, SCIE, EI, 2024, No. 11, pp. 2769-2790 (22 pages)
By integrating deep neural networks with reinforcement learning, the Double Deep Q Network (DDQN) algorithm overcomes the limitations of Q-learning in handling continuous spaces and is widely applied in the path planning of mobile robots. However, the traditional DDQN algorithm suffers from sparse rewards and inefficient utilization of high-quality data. Targeting those problems, an improved DDQN algorithm based on average Q-value estimation and reward redistribution was proposed. First, to enhance the precision of the target Q-value, the average of multiple previously learned Q-values from the target Q network is used to replace the single Q-value from the current target Q network. Next, a reward redistribution mechanism is designed to overcome the sparse reward problem by adjusting the final reward of each action using the round reward from trajectory information. Additionally, a reward-prioritized experience selection method is introduced, which ranks experience samples according to reward values to ensure frequent utilization of high-quality data. Finally, simulation experiments are conducted to verify the effectiveness of the proposed algorithm in a fixed-position scenario and random environments. The experimental results show that compared to the traditional DDQN algorithm, the proposed algorithm achieves shorter average running time, higher average return and fewer average steps. The performance of the proposed algorithm is improved by 11.43% in the fixed scenario and 8.33% in random environments. It not only plans economic and safe paths but also significantly improves efficiency and generalization in path planning, making it suitable for widespread application in autonomous navigation and industrial automation.
Keywords: Double Deep Q Network; path planning; average Q-value estimation; reward redistribution mechanism; reward-prioritized experience selection method
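The average Q-value estimation idea above can be sketched as follows, assuming the average is taken over K frozen snapshots of the target network and simplifying the target to a greedy max over the averaged Q-values; shapes and hyperparameters are illustrative, not the paper's code.

```python
import copy
import torch
import torch.nn as nn

def averaged_target_q(target_nets, next_states, rewards, dones, gamma=0.99):
    """Target where the bootstrap Q-value is the mean over K previously
    saved target networks rather than the single current target network."""
    with torch.no_grad():
        q_stack = torch.stack([net(next_states) for net in target_nets])
        q_avg = q_stack.mean(dim=0)        # average over the K snapshots
        q_next = q_avg.max(dim=1).values   # greedy value from the average
        return rewards + gamma * (1 - dones) * q_next

# Illustrative use: K = 3 frozen snapshots of a small Q-network.
q_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))
snapshots = [copy.deepcopy(q_net) for _ in range(3)]
y = averaged_target_q(snapshots, torch.randn(8, 4), torch.zeros(8), torch.zeros(8))
print(y.shape)  # torch.Size([8])
```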
Relay Selection for Cooperative NOMA Systems Based on the DQN Algorithm
5
Authors: Ying Lin, Yongwei Xiong, Xingbo Gong, Sifei Zhang, Yinhang Tian. Journal of Beijing Institute of Technology, 2025, No. 3, pp. 303-315 (13 pages)
In this study, a solution based on deep Q network (DQN) is proposed to address the relay selection problem in cooperative non-orthogonal multiple access (NOMA) systems. DQN is particularly effective in addressing problems within dynamic and complex communication environments. By formulating the relay selection problem as a Markov decision process (MDP), the DQN algorithm employs deep neural networks (DNNs) to learn and make decisions through real-time interactions with the communication environment, aiming to minimize the system's outage probability. During the learning process, the DQN algorithm progressively acquires channel state information (CSI) between two nodes, thereby minimizing the system's outage probability until a stable level is reached. Simulation results show that the proposed method effectively reduces the outage probability by 82% compared to the two-way relay selection scheme (Two-Way) when the signal-to-noise ratio (SNR) is 30 dB. This study demonstrates the applicability and advantages of the DQN algorithm in cooperative NOMA systems, providing a novel approach to addressing real-time relay selection challenges in dynamic communication environments.
Keywords: deep Q network (DQN); cooperative non-orthogonal multiple access (NOMA); relay selection; outage probability
Artificial Potential Field Incorporated Deep-Q-Network Algorithm for Mobile Robot Path Prediction (Cited by 3)
6
Authors: A. Sivaranjani, B. Vinod. Intelligent Automation & Soft Computing, SCIE, 2023, No. 1, pp. 1135-1150 (16 pages)
Autonomous navigation of mobile robots is a challenging task that requires them to travel from their initial position to their destination without collision in an environment. Reinforcement learning methods give a mobile robot a state-action function suited to its environment: during trial-and-error interaction with its surroundings, the robot finds an ideal behavior on its own. The Deep Q Network (DQN) algorithm is used in TurtleBot 3 (TB3) to achieve the goal by successfully avoiding obstacles, but it requires a large number of training iterations. This research mainly focuses on predicting a mobile robot's best path using the DQN and Artificial Potential Field (APF) algorithms. First, a TB3 Waffle Pi DQN is built and trained to reach the goal. Then the APF shortest-path algorithm is incorporated into the DQN algorithm. The proposed planning approach is compared with the standard DQN method in a virtual environment based on the Robot Operating System (ROS). The simulation results show that the combination of DQN and APF is effective, giving a better optimal path and taking less time than the conventional DQN algorithm. Compared with DQN, the proposed DQN+APF improves the number of successful targets by 88% and the average time by 0.331 s; in terms of average rewards, the positive goal is attained by 85% and the negative goal by -90%.
Keywords: artificial potential field; deep reinforcement learning; mobile robot; TurtleBot; deep Q network; path prediction
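The artificial potential field side of DQN+APF follows the classical attractive/repulsive formulation; a minimal NumPy sketch with assumed gains and influence radius (the paper's actual parameters are not given in the abstract):

```python
import numpy as np

def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0):
    """Classical APF: attractive pull toward the goal plus repulsive
    push from obstacles inside the influence radius d0."""
    f = k_att * (goal - pos)                       # attractive term
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 0 < d < d0:                             # only nearby obstacles repel
            f += k_rep * (1.0 / d - 1.0 / d0) / d**3 * diff
    return f

pos = np.array([0.0, 0.0])
goal = np.array([5.0, 5.0])
obstacles = [np.array([1.0, 1.2])]
print(apf_force(pos, goal, obstacles))
```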
A Document Management Optimization Method for Engineering Projects Based on Document Workflows and Reinforcement Learning
7
Authors: Si Pengbo, Pang Rui, Yang Ruizhe, Sun Yanhua, Li Meng. Journal of Beijing University of Technology (北京工业大学学报), PKU Core, 2025, No. 10, pp. 1162-1170 (9 pages)
To address the transmission time and cost of documents in large engineering projects, a document management optimization method based on document workflows is proposed. First, a project document management environment and a document workflow model with logical ordering are constructed, and document transmission and caching are analyzed. On this basis, the document management optimization problem is modeled as a Markov decision process, and the state space, action space and reward function are designed to jointly optimize workflow task completion time and caching cost. A dueling double deep Q network (D3QN) is then adopted to reduce training time and improve training efficiency. Simulation results verify the effectiveness of the proposed scheme for document transmission under different parameter configurations, and show that it retains good optimization ability as task volume grows.
Keywords: document workflow; transmission time; Markov decision process; dueling double deep Q network (D3QN); document management; joint optimization
DQN-Based Proactive Trajectory Planning of UAVs in Multi-Access Edge Computing (Cited by 2)
8
Authors: Adil Khan, Jinling Zhang, Shabeer Ahmad, Saifullah Memon, Babar Hayat, Ahsan Rafiq. Computers, Materials & Continua, SCIE, EI, 2023, No. 3, pp. 4685-4702 (18 pages)
The main aim of future mobile networks is to provide secure, reliable, intelligent, and seamless connectivity. It also enables mobile network operators to ensure their customers a better quality of service (QoS). Nowadays, Unmanned Aerial Vehicles (UAVs) are a significant part of the mobile network due to their continuously growing use in various applications. For better coverage, cost-effectiveness, and seamless service connectivity and provisioning, UAVs have emerged as the best choice for telco operators. UAVs can be used as flying base stations, edge servers, and relay nodes in mobile networks. On the other side, Multi-access Edge Computing (MEC) technology also emerged in the 5G network to provide a better quality of experience (QoE) to users with different QoS requirements. However, UAVs in a mobile network for coverage enhancement and better QoS face several challenges such as trajectory design, path planning, optimization, QoS assurance, and mobility management. Efficient and proactive path planning and optimization in a highly dynamic environment containing buildings and obstacles is challenging, so an automated Artificial Intelligence (AI) enabled QoS-aware solution is needed for trajectory planning and optimization. Therefore, this work introduces a well-designed AI and MEC-enabled architecture for a UAV-assisted future network. It has an efficient Deep Reinforcement Learning (DRL) algorithm for real-time and proactive trajectory planning and optimization, and it fulfills QoS-aware service provisioning. A greedy-policy approach is used to maximize the long-term reward for serving more users with QoS. Simulation results reveal the superiority of the proposed DRL mechanism for energy-efficient and QoS-aware trajectory planning over the existing models.
Keywords: multi-access edge computing; UAVs; trajectory planning; QoS assurance; reinforcement learning; deep Q network
Deep reinforcement learning for UAV swarm rendezvous behavior (Cited by 2)
9
Authors: Zhang Yaozhong, Li Yike, Wu Zhuoran, Xu Jialin. Journal of Systems Engineering and Electronics, SCIE, EI, CSCD, 2023, No. 2, pp. 360-373 (14 pages)
The unmanned aerial vehicle (UAV) swarm technology is one of the research hotspots in recent years. With the continuous improvement of the autonomous intelligence of UAVs, swarm technology will become one of the main trends of UAV development in the future. This paper studies the behavior decision-making process of the UAV swarm rendezvous task based on the double deep Q network (DDQN) algorithm. We design a guided reward function to effectively solve the convergence problem caused by sparse returns in deep reinforcement learning (DRL) for long-period tasks. We also propose the concept of a temporary storage area, optimizing the memory playback unit of the traditional DDQN algorithm, improving the convergence speed of the algorithm, and speeding up the training process. Different from the traditional task environment, this paper establishes a continuous state-space task environment model to improve the authenticity of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm in different task scenarios are trained. The experimental results validate that the DDQN algorithm is efficient at training the UAV swarm to complete the given collaborative tasks while meeting the swarm's requirements for centralization and autonomy, and improving the intelligence of collaborative task execution. The simulation results show that after training, the proposed UAV swarm can carry out the rendezvous task well, with a mission success rate of 90%.
Keywords: double deep Q network (DDQN) algorithm; unmanned aerial vehicle (UAV) swarm; task decision; deep reinforcement learning (DRL); sparse returns
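The DDQN algorithm this paper builds on forms its target by letting the online network select the next action and the target network evaluate it; a minimal PyTorch sketch with illustrative network shapes (not the paper's implementation):

```python
import torch
import torch.nn as nn

def ddqn_target(online_net, target_net, next_states, rewards, dones, gamma=0.99):
    """Double DQN: the online net picks the next action, the target net
    evaluates it, which reduces the overestimation of plain DQN."""
    with torch.no_grad():
        best_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        q_eval = target_net(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1 - dones) * q_eval

online = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 5))
target = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 5))
target.load_state_dict(online.state_dict())
y = ddqn_target(online, target, torch.randn(4, 6), torch.ones(4), torch.zeros(4))
print(y)
```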
Safe Navigation for UAV-Enabled Data Dissemination by Deep Reinforcement Learning in Unknown Environments (Cited by 1)
10
Authors: Fei Huang, Guangxia Li, Shiwei Tian, Jin Chen, Guangteng Fan, Jinghui Chang. China Communications, SCIE, CSCD, 2022, No. 1, pp. 202-217 (16 pages)
Unmanned aerial vehicles (UAVs) are increasingly considered in safe autonomous navigation systems to explore unknown environments, where UAVs are equipped with multiple sensors to perceive the surroundings. However, how to achieve UAV-enabled data dissemination and ensure safe navigation synchronously is a new challenge. In this paper, our goal is to minimize the whole weighted sum of the UAV's task completion time while satisfying the data transmission task requirement and the UAV's feasible flight region constraints. However, this problem cannot be solved via standard optimization methods, mainly on account of lacking a tractable and accurate system model in practice. To overcome this tough issue, we propose a new solution approach by utilizing the most advanced dueling double deep Q network (dueling DDQN) with multi-step learning. Specifically, to improve the algorithm, extra labels are added to the primitive states. Simulation results indicate the validity and performance superiority of the proposed algorithm under different data thresholds compared with two other benchmarks.
Keywords: unmanned aerial vehicles (UAVs); safe autonomous navigation; unknown environments; data dissemination; dueling double deep Q network (dueling DDQN)
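Multi-step learning, as combined with the dueling DDQN above, replaces the one-step bootstrap with an n-step return; a small sketch with an assumed discount factor:

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Compute the n-step return G = r_0 + γ·r_1 + ... + γ^n·V(s_n)
    from a short trajectory segment plus a bootstrapped tail value."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Illustrative 3-step segment with a bootstrapped tail value of 10.0:
# 1.0 + 0.99*0.0 + 0.99^2*2.0 + 0.99^3*10.0 ≈ 12.663
print(n_step_return([1.0, 0.0, 2.0], bootstrap_value=10.0))
```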
Improved Double Deep Q Network-Based Task Scheduling Algorithm in Edge Computing for Makespan Optimization (Cited by 3)
11
Authors: Lei Zeng, Qi Liu, Shigen Shen, Xiaodong Liu. Tsinghua Science and Technology, SCIE, EI, CAS, CSCD, 2024, No. 3, pp. 806-817 (12 pages)
Edge computing nodes undertake an increasing number of tasks with the rise of business density. Therefore, how to efficiently allocate large-scale and dynamic workloads to edge computing resources has become a critical challenge. This study proposes an edge task scheduling approach based on an improved Double Deep Q Network (DQN), in which the calculation of target Q-values and the selection of actions are separated into two networks. A new reward function is designed, and a control unit is added to the experience replay unit of the agent. The management of experience data is also modified to fully utilize its value and improve learning efficiency. Reinforcement learning agents usually learn from an ignorant state, which is inefficient. As such, this study proposes a novel particle swarm optimization algorithm with an improved fitness function, which can generate optimal solutions for task scheduling. These optimized solutions are provided for the agent to pre-train network parameters and reach a better level of cognition. The proposed algorithm is compared with six other methods in simulation experiments. Results show that the proposed algorithm outperforms the other benchmark methods regarding makespan.
Keywords: edge computing; task scheduling; reinforcement learning; makespan; Double Deep Q Network (DQN)
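The particle swarm optimization used here to pre-train the agent follows the standard velocity/position update; a generic minimization sketch in which a toy quadratic fitness stands in for the paper's improved fitness function (all hyperparameters are assumptions):

```python
import numpy as np

def pso(fitness, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO minimization; returns the best solution found."""
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, (n_particles, dim))    # particle positions
    v = np.zeros_like(x)                          # particle velocities
    pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Velocity pulls toward each particle's best and the global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest

print(pso(lambda p: np.sum(p**2), dim=5))  # toy quadratic fitness
```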
Real-time UAV path planning based on LSTM network (Cited by 2)
12
Authors: Zhang Jiandong, Guo Yukun, Zheng Lihui, Yang Qiming, Shi Guoqing, Wu Yong. Journal of Systems Engineering and Electronics, SCIE, CSCD, 2024, No. 2, pp. 374-385 (12 pages)
To address the shortcomings of single-step decision making in existing deep reinforcement learning based unmanned aerial vehicle (UAV) real-time path planning, a real-time UAV path planning algorithm based on a long short-term memory (RPP-LSTM) network is proposed, which combines the memory characteristics of recurrent neural networks (RNN) with a deep reinforcement learning algorithm. LSTM networks are used as the Q-value networks for the deep Q network (DQN) algorithm, which gives the Q-value network's decisions a degree of memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, effectively avoiding the problem of single-step decisions that consider only the current environment. The algorithm also proposes a hierarchical reward and punishment function for the specific problem of UAV real-time path planning, so that the UAV performs path planning more reasonably. Simulation verification shows that compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, RPP-LSTM adapts to more complex environments and achieves significantly improved robustness and accuracy in real-time UAV path planning.
Keywords: deep Q network; path planning; neural network; unmanned aerial vehicle (UAV); long short-term memory (LSTM)
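Using an LSTM as the Q-value network, as RPP-LSTM does, lets the Q-value depend on a short history of states rather than the current state alone; a minimal PyTorch sketch with assumed dimensions (the paper's layer sizes are not given in the abstract):

```python
import torch
import torch.nn as nn

class LSTMQNet(nn.Module):
    """Q-network whose input is a sequence of recent states, so the
    value estimate carries memory of previous environment information."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, state_seq):                  # (batch, seq_len, state_dim)
        out, _ = self.lstm(state_seq)
        return self.head(out[:, -1])               # Q-values from the last step

net = LSTMQNet(state_dim=8, n_actions=6)
q = net(torch.randn(2, 5, 8))                      # 5-step state history
print(q.shape)                                     # torch.Size([2, 6])
```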
Research on Short-Term Stochastic Optimal Scheduling of Hydro-Wind-Solar Systems Based on Deep Learning (Cited by 2)
13
Author: Zhang Yifan. Hydropower and New Energy (水电与新能源), 2024, No. 3, pp. 34-37 (4 pages)
China is committed to developing renewable energy and has proposed hydro-wind-solar multi-energy complementary systems; because wind and solar output is uncertain, real-time grid dispatch adjustments are needed. This paper applies a deep Q network (DQN) to optimize the system's short-term scheduling and maximize generation benefit. Latin hypercube sampling and a scenario reduction technique based on the Kantorovich distance are used to reflect the uncertainty distribution of renewable output, and a short-term optimal scheduling model of the multi-energy complementary system is built with deep reinforcement learning. Simulations on real data show that the method effectively handles high-dimensional and related problems and has clear advantages over traditional methods.
Keywords: short-term scheduling; uncertainty; Latin hypercube sampling; scenario reduction; Deep Q Network
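Latin hypercube sampling, used above to generate wind and solar scenarios, places exactly one sample in each equal-probability stratum of every dimension; a small NumPy sketch (the Kantorovich-distance scenario reduction step is omitted):

```python
import numpy as np

def latin_hypercube(n_samples, dim, seed=0):
    """Draw n_samples points in [0,1]^dim, one per stratum along every
    dimension, with the stratum order shuffled independently per dimension."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_samples, dim))       # jitter inside each bin
    samples = np.empty((n_samples, dim))
    for d in range(dim):
        perm = rng.permutation(n_samples)  # shuffle bin assignment
        samples[:, d] = (perm + u[:, d]) / n_samples
    return samples

# E.g. 10 joint scenarios over (wind output, solar output), both scaled to [0,1].
print(latin_hypercube(10, 2))
```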
Situational continuity-based air combat autonomous maneuvering decision-making (Cited by 5)
14
Authors: Jian-dong Zhang, Yi-fei Yu, Li-hui Zheng, Qi-ming Yang, Guo-qing Shi, Yong Wu. Defence Technology (防务技术), SCIE, EI, CAS, CSCD, 2023, No. 11, pp. 66-79 (14 pages)
In order to improve the performance of a UAV's autonomous maneuvering decision-making, this paper proposes a decision-making method based on situational continuity. The algorithm designs a situation evaluation function with strong guidance, then trains a Long Short-Term Memory (LSTM) network under the framework of Deep Q Network (DQN) for air combat maneuvering decision-making. Considering the continuity between adjacent situations, the method takes multiple consecutive situations as one input to the neural network. To reflect the difference between adjacent situations, the method takes the difference of situation evaluation values as the reinforcement learning reward. In different scenarios, the proposed algorithm is compared with an algorithm based on a Fully Neural Network (FNN) and an algorithm based on statistical principles. The results show that, compared with the FNN algorithm, the proposed algorithm is more accurate and forward-looking; compared with the algorithm based on statistical principles, its decision-making is more efficient and its real-time performance is better.
Keywords: UAV; maneuvering decision-making; situational continuity; long short-term memory (LSTM); deep Q network (DQN); fully neural network (FNN)
Design of a Deep Reinforcement Learning Control Experiment for the Googol (固高) Linear Single Inverted Pendulum
15
Authors: Feng Xiaoxue, Xie Tian, Wen Yue, Li Weixing. Science & Technology Information (科技资讯), 2023, No. 23, pp. 4-10 (7 pages)
To meet the learning needs of artificial intelligence students in machine learning, while taking into account the operability, real-time performance and safety of the Googol Technology linear single inverted pendulum control system, a deep reinforcement learning based control experiment scheme is designed for the pendulum. First, a model-free controller is built with a deep reinforcement learning algorithm and validated in virtual simulation experiments. Considering the refresh-rate limit of the pendulum's motor drive and to increase sample processing speed, a balance controller based on offline Q-learning is further designed to achieve stable control of the physical pendulum. The scheme deepens students' understanding of artificial intelligence and suits the application scenario of the Googol linear single inverted pendulum.
Keywords: linear single inverted pendulum; deep reinforcement learning; Deep Q Network algorithm; Q-learning algorithm
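The offline Q-learning balance controller mentioned above rests on the standard tabular update rule; a minimal sketch assuming a discretized pendulum state space and logged transitions (the paper's discretization and reward are not given in the abstract):

```python
import numpy as np

# Assumed discretization: 50 angle/angular-velocity bins, 3 force actions.
N_STATES, N_ACTIONS = 50, 3
Q = np.zeros((N_STATES, N_ACTIONS))

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Tabular Q-learning: Q(s,a) += α·[r + γ·max_a' Q(s',a') − Q(s,a)].
    Offline training replays logged transitions instead of live control."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

# One illustrative logged transition (state 10, action 2, reward 1, next 11).
q_update(10, 2, 1.0, 11)
print(Q[10])
```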
Research on intelligent fault diagnosis for railway point machines using deep reinforcement learning (Cited by 1)
16
Authors: Shuai Xiao, Qingsheng Feng, Xue Li, Hong Li. Transportation Safety and Environment, 2024, No. 4, pp. 75-86 (12 pages)
The advanced diagnosis of faults in railway point machines is crucial for ensuring the smooth operation of the turnout conversion system and the safe functioning of trains. Signal processing and deep learning-based methods have been extensively explored in the realm of fault diagnosis. While these approaches effectively extract fault features and facilitate the creation of end-to-end diagnostic models, they often demand considerable expert experience and manual intervention in feature selection, structural construction and parameter optimization of neural networks. This reliance on manual effort can result in weak generalization performance and a lack of intelligence in the model. To address these challenges, this study introduces an intelligent fault diagnosis method based on deep reinforcement learning (DRL). Initially, a one-dimensional convolutional neural network agent is established, leveraging the specific characteristics of point machine fault data to automatically extract diverse features across multiple scales. Subsequently, a deep Q network is incorporated as the central component of the diagnostic framework. The fault classification interactive environment is meticulously designed, and the agent training network is optimized. Through extensive interaction between the agent and the environment using fault data, satisfactory cumulative rewards and effective fault classification strategies are achieved. Experimental results demonstrate the proposed method's high efficacy, with a training accuracy of 98.9% and a commendable test accuracy of 98.41%. Notably, the utilization of DRL in addressing the fault diagnosis challenge for railway point machines enhances the intelligence of the diagnostic process, particularly through its excellent independent exploration capability.
Keywords: fault diagnosis; railway point machines; one-dimensional convolutional neural network; deep Q network algorithm
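The one-dimensional convolutional agent described above can be sketched as a small Conv1d network that maps a raw point-machine signal to Q-values over fault classes; kernel sizes and channel counts are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class Conv1DQNet(nn.Module):
    """1-D CNN that maps a raw fault signal to Q-values over fault
    classes, as in a DQN-based diagnosis framework."""
    def __init__(self, n_classes, in_ch=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_ch, 16, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),               # length-independent pooling
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, signal):                     # (batch, 1, signal_len)
        return self.head(self.features(signal).squeeze(-1))

net = Conv1DQNet(n_classes=4)
print(net(torch.randn(2, 1, 1024)).shape)          # torch.Size([2, 4])
```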
Fault Identification in Power Network Based on Deep Reinforcement Learning (Cited by 6)
17
Authors: Mengshi Li, Huanming Zhang, Tianyao Ji, Q. H. Wu. CSEE Journal of Power and Energy Systems, SCIE, EI, CSCD, 2022, No. 3, pp. 721-731 (11 pages)
With the integration of alternative energy and renewables, the issue of stability and resilience of the power network has received considerable attention. The basic necessity for fault diagnosis and isolation is fault identification and location. Conventional intelligent fault identification methods need supervision, manual labelling of characteristics, and large amounts of labelled data. To enhance the ability of intelligent methods and remove the dependence on large amounts of labelled data, a novel fault identification method based on deep reinforcement learning (DRL), which has not received enough attention in the field of fault identification, is investigated in this paper. The proposed method uses different faults as parameters of the model to expand the scope of fault identification. In addition, the DRL algorithm can intelligently modify the fault parameters according to observations obtained from the power network environment, rather than requiring manual and mechanical tuning of parameters. The methodology was tested on the IEEE 14-bus system for several scenarios, and the performance of the proposed method was compared with that of population-based optimization methods and supervised learning methods. The obtained results confirm the feasibility and effectiveness of the proposed method.
Keywords: artificial intelligence; deep Q network; deep reinforcement learning; fault diagnosis; fault identification; parameter identification; power network
Tactical conflict resolution in urban airspace for unmanned aerial vehicles operations using attention-based deep reinforcement learning (Cited by 2)
18
Authors: Mingcheng Zhang, Chao Yan, Wei Dai, Xiaojia Xiang, Kin Huat Low. Green Energy and Intelligent Transportation, 2023, No. 4, pp. 43-57 (15 pages)
Unmanned aerial vehicles (UAVs) have gained much attention from academic and industrial areas due to the significant number of potential applications in urban airspace. A traffic management system is needed to manage this future UAV traffic. Tactical conflict resolution for unmanned aerial systems (UASs) is an essential piece of the puzzle for future UAS Traffic Management (UTM), especially in very low-level (VLL) urban airspace. Unlike conflict resolution in higher-altitude airspace, dense high-rise buildings are an essential source of potential conflict to be considered in VLL urban airspace. In this paper, we propose an attention-based deep reinforcement learning approach to solve the tactical conflict resolution problem. Specifically, we formulate this task as a sequential decision-making problem using a Markov Decision Process (MDP). The double deep Q network (DDQN) framework is used as a learning framework for the host drone to learn to output conflict-free maneuvers at each time step. We use the attention mechanism to model each individual neighbor's effect on the host drone, endowing the learned conflict resolution policy with the ability to adapt to an arbitrary number of neighboring drones. Lastly, we build a simulation environment with various scenarios covering different types of encounters to evaluate the proposed approach. The simulation results demonstrate that our proposed algorithm provides a reliable solution to minimize secondary conflict counts compared to learning and non-learning-based approaches under different traffic density scenarios.
Keywords: unmanned aircraft system traffic management; tactical conflict resolution; double deep Q network; attention mechanism; secondary conflict
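The attention mechanism described above pools a variable number of neighboring drones into a fixed-size summary for the host drone's policy; a minimal PyTorch sketch with assumed feature dimensions (the paper's exact formulation is not reproduced here):

```python
import torch
import torch.nn as nn

class NeighborAttention(nn.Module):
    """Score each neighbor against the host drone's state and return an
    attention-weighted summary, independent of the neighbor count."""
    def __init__(self, host_dim, nbr_dim, hidden=32):
        super().__init__()
        self.query = nn.Linear(host_dim, hidden)
        self.key = nn.Linear(nbr_dim, hidden)
        self.value = nn.Linear(nbr_dim, hidden)

    def forward(self, host, neighbors):            # (B, host_dim), (B, N, nbr_dim)
        q = self.query(host).unsqueeze(1)           # (B, 1, hidden)
        k, v = self.key(neighbors), self.value(neighbors)
        scores = (q * k).sum(-1) / k.shape[-1] ** 0.5
        w = torch.softmax(scores, dim=-1)           # one weight per neighbor
        return (w.unsqueeze(-1) * v).sum(dim=1)     # (B, hidden)

att = NeighborAttention(host_dim=6, nbr_dim=4)
print(att(torch.randn(2, 6), torch.randn(2, 3, 4)).shape)  # torch.Size([2, 32])
```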
Reusable electronic products value prediction based on reinforcement learning (Cited by 1)
19
Authors: Du YongPing, Jin XingNan, Han HongGui, Wang LuLin. Science China (Technological Sciences), SCIE, EI, CAS, CSCD, 2022, No. 7, pp. 1578-1586 (9 pages)
With the appearance of a huge number of reusable electronic products, precise value evaluation has become an urgent problem to be solved in the recycling process. Traditional methods mostly rely on manual intervention. To make the model more suitable for dynamic updating, this paper proposes a reinforcement learning based electronic product value prediction model that integrates market information to achieve timely and stable prediction results. The basic attributes and depreciation attributes of the product are modeled by two parallel neural networks separately, to learn their different effects on prediction. Most importantly, a double deep Q network is adopted to fuse market information through a reinforcement learning strategy, and training on old product data can be used to predict newly appearing products, which alleviates the cold-start problem. Experiments on data from a real mobile phone recycling platform verify that the model achieves higher accuracy and better generalization ability.
Keywords: electronic product value prediction; reinforcement learning; market factor; deep Q network