Journal Articles
4 articles found
1. Exploring crash induction strategies in within-visual-range air combat based on distributional reinforcement learning
Authors: Zetian HU, Xuefeng LIANG, Jun ZHANG, Xiaochuan YOU, Chengcheng MA. Chinese Journal of Aeronautics, 2025, Issue 9, pp. 350-364 (15 pages)
Within-Visual-Range (WVR) air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making. Traditional approaches, including rule-based methods and conventional Reinforcement Learning (RL) algorithms, often focus on maximizing engagement outcomes through direct combat superiority. However, these methods overlook alternative tactics, such as inducing adversaries to crash, which can achieve decisive victories at lower risk and cost. This study proposes Alpha Crash, a novel distributional-reinforcement-learning-based agent specifically designed to defeat opponents by leveraging crash induction strategies. The approach integrates an improved QR-DQN framework to address uncertainties and adversarial tactics, incorporating advanced pilot experience into its reward functions. Extensive simulations reveal Alpha Crash's robust performance, achieving a 91.2% win rate across diverse scenarios by effectively guiding opponents into critical errors. Visualization and altitude analyses illustrate the agent's three-stage crash induction strategies, which exploit adversaries' vulnerabilities. These findings underscore Alpha Crash's potential to enhance autonomous decision-making and strategic innovation in real-world air combat applications.
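The abstract only names QR-DQN as the distributional-RL backbone; the paper's actual implementation is not reproduced here. A minimal NumPy sketch of the two core pieces of QR-DQN — the quantile Huber loss over predicted vs. target quantiles, and greedy action selection on quantile means — might look like the following (all function names and the `kappa` default are illustrative assumptions):

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss as used in QR-DQN.

    pred_quantiles: (N,) predicted quantile values for one state-action.
    target_samples: (M,) samples of the target return distribution.
    """
    N = len(pred_quantiles)
    # Quantile midpoints tau_hat_i = (i + 0.5) / N.
    taus = (np.arange(N) + 0.5) / N
    # Pairwise TD errors u_{ij} = target_j - pred_i, shape (N, M).
    u = target_samples[None, :] - pred_quantiles[:, None]
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # Asymmetric quantile weighting |tau_i - 1{u_{ij} < 0}|.
    weight = np.abs(taus[:, None] - (u < 0).astype(float))
    return float((weight * huber).mean())

def greedy_action(quantiles_per_action):
    """Act greedily on the mean of each action's quantile distribution."""
    return int(np.argmax(quantiles_per_action.mean(axis=1)))
```

The distributional view (a set of quantiles per action, rather than a single expected value) is what lets such an agent reason about outcome uncertainty, which the abstract credits for handling adversarial tactics.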
Keywords: unmanned combat aerial vehicle; decision-making; distributional reinforcement learning; within-visual-range air combat; crash induction strategy
2. A new accelerating algorithm for multi-agent reinforcement learning (Cited by 1)
Authors: 张汝波, 仲宇, 顾国昌. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2005, Issue 1, pp. 48-51 (4 pages)
In multi-agent systems, joint actions must be employed to achieve cooperation because the evaluation of an agent's behavior often depends on the other agents' behaviors. However, joint-action reinforcement learning algorithms suffer from slow convergence because of the enormous learning space produced by joint actions. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks, which requires all agents to learn to predict the probabilities of the actions that other agents may execute. A multi-robot cooperation experiment was run to test the efficacy of the new algorithm, and the results show that it reaches the cooperation policy much faster than the primitive reinforcement learning algorithm.
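The abstract describes the key idea — each agent maintains a predictive model of the other agents' action probabilities and evaluates its own actions against that prediction — without giving details. A small sketch under assumed names (the `ActionPredictor` class, Laplace smoothing, and the tabular joint-Q layout are all illustrative, not the paper's implementation):

```python
import numpy as np

class ActionPredictor:
    """Empirical model of a teammate's action probabilities, per state."""
    def __init__(self, n_states, n_actions):
        # Start from a uniform Laplace prior so unseen states are well-defined.
        self.counts = np.ones((n_states, n_actions))

    def update(self, state, observed_action):
        self.counts[state, observed_action] += 1

    def probs(self, state):
        row = self.counts[state]
        return row / row.sum()

def expected_q(joint_q, predictor, state, my_action):
    """Expected value of my_action, marginalizing the teammate's predicted
    action distribution over a joint-action Q-table of shape
    (n_states, n_my_actions, n_other_actions)."""
    p = predictor.probs(state)
    return float(joint_q[state, my_action] @ p)
```

Marginalizing over the predicted distribution is what shrinks the effective search space: each agent learns over its own actions instead of the full joint-action space, which is the source of the claimed speed-up.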
Keywords: distributed reinforcement learning; accelerating algorithm; machine learning; multi-agent system
3. Autonomous Vehicle Platoons in Urban Road Networks: A Joint Distributed Reinforcement Learning and Model Predictive Control Approach
Authors: Luigi D'Alfonso, Francesco Giannini, Giuseppe Franzè, Giuseppe Fedele, Francesco Pupo, Giancarlo Fortino. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 1, pp. 141-156 (16 pages)
In this paper, platoons of autonomous vehicles operating in urban road networks are considered. From a methodological point of view, the problem of interest consists of formally characterizing vehicle state trajectory tubes by means of routing decisions complying with traffic congestion criteria. To this end, a novel distributed control architecture is conceived by taking advantage of two methodologies: deep reinforcement learning and model predictive control. On one hand, the routing decisions are obtained by using a distributed reinforcement learning algorithm that exploits available traffic data at each road junction. On the other hand, a bank of model predictive controllers is in charge of computing the most adequate control action for each involved vehicle. These tasks are combined into a single framework: the deep reinforcement learning output (action) is translated into a set-point to be tracked by the model predictive controller; conversely, the current vehicle position, resulting from the application of the control move, is exploited by the deep reinforcement learning unit to improve its reliability. The main novelty of the proposed solution lies in its hybrid nature: on one hand, it fully exploits deep reinforcement learning capabilities for decision-making purposes; on the other hand, time-varying hard constraints are always satisfied during the dynamical platoon evolution imposed by the computed routing decisions. To efficiently evaluate the performance of the proposed control architecture, a co-design procedure involving the SUMO and MATLAB platforms is implemented, so that complex operating environments can be used and the information coming from road maps (links, junctions, obstacles, semaphores, etc.) and vehicle state trajectories can be shared and exchanged. Finally, considering as operating scenario a real entire city block and a platoon of eleven vehicles described by double-integrator models, several simulations have been performed with the aim of highlighting the main features of the proposed approach. Moreover, in different operating scenarios the proposed reinforcement learning scheme significantly reduces traffic congestion phenomena when compared with well-reputed competitors.
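The coupling the abstract describes — the RL layer emits a set-point, an MPC tracks it on double-integrator dynamics, and the resulting state feeds back to the RL layer — can be sketched in miniature. This is a deliberately naive stand-in (brute-force enumeration over a tiny acceleration grid, a stubbed policy, and assumed constants `DT`, `horizon`), not the paper's controller:

```python
import numpy as np
from itertools import product

DT = 0.1  # sample time (assumed)

def step(state, accel):
    """Double-integrator dynamics: pos' = pos + vel*dt, vel' = vel + a*dt."""
    pos, vel = state
    return np.array([pos + vel * DT, vel + accel * DT])

def mpc_accel(state, setpoint, horizon=3, accels=(-1.0, 0.0, 1.0)):
    """Brute-force MPC: enumerate short acceleration sequences, score them by
    tracking error plus a small control penalty, apply the first move of the
    best sequence (receding horizon)."""
    best_cost, best_first = float("inf"), 0.0
    for seq in product(accels, repeat=horizon):
        s = np.array(state, dtype=float)
        cost = 0.0
        for a in seq:
            s = step(s, a)
            cost += (s[0] - setpoint) ** 2 + 0.01 * a ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

def closed_loop(policy, state, n_steps=50):
    """RL-over-MPC loop: the policy maps state to a set-point (the 'action'),
    the MPC tracks it, and the new state is fed back to the policy."""
    for _ in range(n_steps):
        setpoint = policy(state)  # DRL output translated into a set-point
        state = step(state, mpc_accel(state, setpoint))
    return state
```

A real implementation would replace the enumeration with a constrained QP per vehicle (which is how hard state constraints stay satisfied) and the stub policy with the trained routing network.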
Keywords: distributed model predictive control; distributed reinforcement learning; routing decisions; urban road networks
4. Active Power Correction Strategies Based on Deep Reinforcement Learning, Part II: A Distributed Solution for Adaptability (Cited by 3)
Authors: Siyuan Jiajun Duan, Yuyang Bai, Jun Zhang, Di Shi, Zhiwei Wang, Xuzhu Dong, Yuanzhang Sun. CSEE Journal of Power and Energy Systems (SCIE, EI, CSCD), 2022, Issue 4, pp. 1134-1144 (11 pages)
This article is the second part of Active Power Correction Strategies Based on Deep Reinforcement Learning. In Part II, we consider renewable energy scenarios plugged into the large-scale power grid and provide an adaptive algorithmic implementation to maintain power grid stability. Based on the robustness method in Part I, a distributed deep reinforcement learning method is proposed to overcome the influence of increasing renewable energy penetration. A multi-agent system is implemented in multiple control areas of the power system, which conducts a fully cooperative stochastic game. Based on the Monte Carlo tree search mentioned in Part I, we select practical actions in each sub-control area to search for the Nash equilibrium of the game. Based on the QMIX method, a structure of offline centralized training and online distributed execution is proposed to employ better practical actions in active power correction control. Our proposed method is evaluated on the modified global competition scenario cases of "2020 Learning to Run a Power Network, NeurIPS Track 2".
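The abstract names QMIX as the basis for centralized training with distributed execution but gives no detail. The mechanism that makes distributed execution sound is QMIX's monotonic mixing: per-agent utilities are combined with state-conditioned, non-negative weights, so the joint argmax factorizes across agents. A sketch under stated assumptions (shapes, the hypernetwork layout, and the ReLU stand-in for the original ELU are all illustrative):

```python
import numpy as np

def qmix_mix(agent_qs, state, w1, b1, w2, b2):
    """QMIX-style monotonic mixer (sketch). Non-negative mixing weights
    guarantee dQ_tot/dQ_i >= 0, so at execution time each agent can act
    greedily on its own Q-value without consulting the centralized mixer."""
    n_agents = len(agent_qs)
    # Hypernetwork: the global state generates the mixing weights;
    # the absolute value enforces non-negativity (monotonicity).
    W1 = np.abs(state @ w1).reshape(n_agents, -1)
    W2 = np.abs(state @ w2)
    hidden = np.maximum(agent_qs @ W1 + b1, 0.0)  # ReLU here; ELU in QMIX
    return float(hidden @ W2 + b2)
```

During offline centralized training the mixer sees the global grid state; online, each sub-control-area agent keeps only its own Q-network, which is what the abstract's "online distributed execution" refers to.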
Keywords: active power correction strategies; distributed deep reinforcement learning; Nash equilibrium; renewable energies; stochastic game