Cooperative autonomous air combat of multiple unmanned aerial vehicles(UAVs)is one of the main combat modes in future air warfare,which becomes even more complicated with highly changeable situation and uncertain info...Cooperative autonomous air combat of multiple unmanned aerial vehicles(UAVs)is one of the main combat modes in future air warfare,which becomes even more complicated with highly changeable situation and uncertain information of the opponents.As such,this paper presents a cooperative decision-making method based on incomplete information dynamic game to generate maneuver strategies for multiple UAVs in air combat.Firstly,a cooperative situation assessment model is presented to measure the overall combat situation.Secondly,an incomplete information dynamic game model is proposed to model the dynamic process of air combat,and a dynamic Bayesian network is designed to infer the tactical intention of the opponent.Then a reinforcement learning framework based on multiagent deep deterministic policy gradient is established to obtain the perfect Bayes-Nash equilibrium solution of the air combat game model.Finally,a series of simulations are conducted to verify the effectiveness of the proposed method,and the simulation results show effective synergies and cooperative tactics.展开更多
Unmanned aerial vehicles(UAVs)are widely used in situations with uncertain and risky areas lacking network coverage.In natural disasters,timely delivery of first aid supplies is crucial.Current UAVs face risks such as...Unmanned aerial vehicles(UAVs)are widely used in situations with uncertain and risky areas lacking network coverage.In natural disasters,timely delivery of first aid supplies is crucial.Current UAVs face risks such as crashing into birds or unexpected structures.Airdrop systems with parachutes risk dispersing payloads away from target locations.The objective here is to use multiple UAVs to distribute payloads cooperatively to assigned locations.The civil defense department must balance coverage,accurate landing,and flight safety while considering battery power and capability.Deep Q-network(DQN)models are commonly used in multi-UAV path planning to effectively represent the surroundings and action spaces.Earlier strategies focused on advanced DQNs for UAV path planning in different configurations,but rarely addressed non-cooperative scenarios and disaster environments.This paper introduces a new DQN framework to tackle challenges in disaster environments.It considers unforeseen structures and birds that could cause UAV crashes and assumes urgent landing zones and winch-based airdrop systems for precise delivery and return.A new DQN model is developed,which incorporates the battery life,safe flying distance between UAVs,and remaining delivery points to encode surrounding hazards into the state space and Q-networks.Additionally,a unique reward system is created to improve UAV action sequences for better delivery coverage and safe landings.The experimental results demonstrate that multi-UAV first aid delivery in disaster environments can achieve advanced performance.展开更多
Within-Visual-Range(WVR)air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making.Traditional approaches,including rule-based methods and conventio...Within-Visual-Range(WVR)air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making.Traditional approaches,including rule-based methods and conventional Reinforcement Learning(RL)algorithms,often focus on maximizing engagement outcomes through direct combat superiority.However,these methods overlook alternative tactics,such as inducing adversaries to crash,which can achieve decisive victories with lower risk and cost.This study proposes Alpha Crash,a novel distributional-rein forcement-learning-based agent specifically designed to defeat opponents by leveraging crash induction strategies.The approach integrates an improved QR-DQN framework to address uncertainties and adversarial tactics,incorporating advanced pilot experience into its reward functions.Extensive simulations reveal Alpha Crash's robust performance,achieving a 91.2%win rate across diverse scenarios by effectively guiding opponents into critical errors.Visualization and altitude analyses illustrate the agent's three-stage crash induction strategies that exploit adversaries'vulnerabilities.These findings underscore Alpha Crash's potential to enhance autonomous decision-making and strategic innovation in real-world air combat applications.展开更多
Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning(MARL)in multiple Unmanned Combat Aerial Vehicle(UCAV)air combat scenarios.In view of this,this paper proposes...Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning(MARL)in multiple Unmanned Combat Aerial Vehicle(UCAV)air combat scenarios.In view of this,this paper proposes a novel Dominant and Non-dominant strategy sample selection(DoNot)mechanism and a Local Observation Enhanced Multi-Agent Proximal Policy Optimization(LOE-MAPPO)algorithm to train the multi-UCAV air combat policy and improve its generalization.Specifically,the LOE-MAPPO algorithm adopts a mixed state that concatenates the global state and individual agent's local observation to enable efficient value function learning in multi-UCAV air combat.The DoNot mechanism classifies opponents into dominant or non-dominant strategy opponents,and samples from easier to more challenging opponents to form an adaptive training curriculum.Empirical results demonstrate that the proposed LOE-MAPPO algorithm outperforms baseline MARL algorithms in multi-UCAV air combat scenarios,and the DoNot mechanism leads to stronger policy generalization when facing diverse opponents.The results pave the way for the fast generation of cooperative strategies for air combat agents with MARLalgorithms.展开更多
In multiple Unmanned Aerial Vehicles(UAV)systems,achieving efficient navigation is essential for executing complex tasks and enhancing autonomy.Traditional navigation methods depend on predefined control strategies an...In multiple Unmanned Aerial Vehicles(UAV)systems,achieving efficient navigation is essential for executing complex tasks and enhancing autonomy.Traditional navigation methods depend on predefined control strategies and trajectory planning and often perform poorly in complex environments.To improve the UAV-environment interaction efficiency,this study proposes a multi-UAV integrated navigation algorithm based on Deep Reinforcement Learning(DRL).This algorithm integrates the Inertial Navigation System(INS),Global Navigation Satellite System(GNSS),and Visual Navigation System(VNS)for comprehensive information fusion.Specifically,an improved multi-UAV integrated navigation algorithm called Information Fusion with MultiAgent Deep Deterministic Policy Gradient(IF-MADDPG)was developed.This algorithm enables UAVs to learn collaboratively and optimize their flight trajectories in real time.Through simulations and experiments,test scenarios in GNSS-denied environments were constructed to evaluate the effectiveness of the algorithm.The experimental results demonstrate that the IF-MADDPG algorithm significantly enhances the collaborative navigation capabilities of multiple UAVs in formation maintenance and GNSS-denied environments.Additionally,it has advantages in terms of mission completion time.This study provides a novel approach for efficient collaboration in multi-UAV systems,which significantly improves the robustness and adaptability of navigation systems.展开更多
The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and emerged through complex interactions to form a vast combat system of systems(CSoS),...The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and emerged through complex interactions to form a vast combat system of systems(CSoS),which can be abstracted as a heterogeneous combat network(HCN).It is of great military significance to study the disintegration strategy of combat networks to achieve the breakdown of the enemy’s CSoS.To this end,this paper proposes an integrated framework called HCN disintegration based on double deep Q-learning(HCN-DDQL).Firstly,the enemy’s CSoS is abstracted as an HCN,and an evaluation index based on the capability and attack costs of nodes is proposed.Meanwhile,a mathematical optimization model for HCN disintegration is established.Secondly,the learning environment and double deep Q-network model of HCN-DDQL are established to train the HCN’s disintegration strategy.Then,based on the learned HCN-DDQL model,an algorithm for calculating the HCN’s optimal disintegration strategy under different states is proposed.Finally,a case study is used to demonstrate the reliability and effectiveness of HCNDDQL,and the results demonstrate that HCN-DDQL can disintegrate HCNs more effectively than baseline methods.展开更多
During its interaction with modern sports,traditional Wushu has faced increasing doubts about its combat effectiveness,raising concerns about its cultural identity.How traditional Wushu is understood as a combat art n...During its interaction with modern sports,traditional Wushu has faced increasing doubts about its combat effectiveness,raising concerns about its cultural identity.How traditional Wushu is understood as a combat art not only helps define its cultural essence but also carries important implications for its long-term development.It is an objective fact that combat represents the practical manifestation of traditional Wushu in history.Combat reflects similarities among traditional Wushu forms that emerged throughout history.Combat reflects the historical law governing the evolution of traditional Wushu and represents an abstraction of repetitive phenomena in traditional Wushu.A correct understanding of this objectivity,these similarities,and this repeatability is conducive to promoting and carrying forward traditional Wushu,thereby facilitating an objective analysis of differences among different traditional Wushu forms and the discovery of their evolution paradigm.In the contemporary context,it is essential for traditional Wushu to emphasize its distinctive cultural roots,thereby facilitating creative transformation and innovative development.展开更多
Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the atten...Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the attention mechanism.The method is based on the Soft Actor-Critic(SAC)framework,which introduces a multi-attention mechanism in the Critic network,dynamically learns the dependency relationship between intelligences,and realizes key information screening and conflict avoidance.An environment with multiple random obstacles is designed to simulate complex emergent situations.The results show that the proposed algorithm significantly improves the mission success rate and average reward,significantly extends the survival time and exploration range of the UAVs,and verifies the effectiveness of the attention mechanism in enhancing the efficiency,robustness,and long-term planning capability of multi-UAV collaboration,as compared to the baseline method that does not use attention.展开更多
This paper proposes a Multi-Agent Attention Proximal Policy Optimization(MA2PPO)algorithm aiming at the problems such as credit assignment,low collaboration efficiency and weak strategy generalization ability existing...This paper proposes a Multi-Agent Attention Proximal Policy Optimization(MA2PPO)algorithm aiming at the problems such as credit assignment,low collaboration efficiency and weak strategy generalization ability existing in the cooperative pursuit tasks of multiple unmanned aerial vehicles(UAVs).Traditional algorithms often fail to effectively identify critical cooperative relationships in such tasks,leading to low capture efficiency and a significant decline in performance when the scale expands.To tackle these issues,based on the proximal policy optimization(PPO)algorithm,MA2PPO adopts the centralized training with decentralized execution(CTDE)framework and introduces a dynamic decoupling mechanism,that is,sharing the multi-head attention(MHA)mechanism for critics during centralized training to solve the credit assignment problem.This method enables the pursuers to identify highly correlated interactions with their teammates,effectively eliminate irrelevant and weakly relevant interactions,and decompose large-scale cooperation problems into decoupled sub-problems,thereby enhancing the collaborative efficiency and policy stability among multiple agents.Furthermore,a reward function has been devised to facilitate the pursuers to encircle the escapee by combining a formation reward with a distance reward,which incentivizes UAVs to develop sophisticated cooperative pursuit strategies.Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs.Moreover,experiments on scalability have demonstrated that the algorithm is suitable for large-scale multi-UAV systems.展开更多
This study introduces a novel algorithm known as the dung beetle optimization algorithm based on bounded reflection optimization andmulti-strategy fusion(BFDBO),which is designed to tackle the complexities associated ...This study introduces a novel algorithm known as the dung beetle optimization algorithm based on bounded reflection optimization andmulti-strategy fusion(BFDBO),which is designed to tackle the complexities associated with multi-UAV collaborative trajectory planning in intricate battlefield environments.Initially,a collaborative planning cost function for the multi-UAV system is formulated,thereby converting the trajectory planning challenge into an optimization problem.Building on the foundational dung beetle optimization(DBO)algorithm,BFDBO incorporates three significant innovations:a boundary reflection mechanism,an adaptive mixed exploration strategy,and a dynamic multi-scale mutation strategy.These enhancements are intended to optimize the equilibrium between local exploration and global exploitation,facilitating the discovery of globally optimal trajectories thatminimize the cost function.Numerical simulations utilizing the CEC2022 benchmark function indicate that all three enhancements of BFDBOpositively influence its performance,resulting in accelerated convergence and improved optimization accuracy relative to leading optimization algorithms.In two battlefield scenarios of varying complexities,BFDBO achieved a minimum of a 39% reduction in total trajectory planning costs when compared to DBO and three other highperformance variants,while also demonstrating superior average runtime.This evidence underscores the effectiveness and applicability of BFDBO in practical,real-world contexts.展开更多
The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots,making rapid,accurate decision-making challenging.While reinforcement learning(RL)has shown promise in th...The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots,making rapid,accurate decision-making challenging.While reinforcement learning(RL)has shown promise in this domain,the existing methods often lack strategic depth and generalization in complex,high-dimensional environments.To address these limitations,this paper proposes an optimized self-play method enhanced by advancements in fighter modeling,neural network design,and algorithmic frameworks.This study employs a six-degree-of-freedom(6-DOF)F-16 fighter model based on open-source aerodynamic data,featuring airborne equipment and a realistic visual simulation platform,unlike traditional 3-DOF models.To capture temporal dynamics,Long Short-Term Memory(LSTM)layers are integrated into the neural network,complemented by delayed input stacking.The RL environment incorporates expert strategies,curiositydriven rewards,and curriculum learning to improve adaptability and strategic decision-making.Experimental results demonstrate that the proposed approach achieves a winning rate exceeding90%against classical single-agent methods.Additionally,through enhanced 3D visual platforms,we conducted human-agent confrontation experiments,where the agent attained an average winning rate of over 75%.The agent's maneuver trajectories closely align with human pilot strategies,showcasing its potential in decision-making and pilot training applications.This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.展开更多
Multiple UAVs cooperative target search has been widely used in various environments,such as emergency rescue and traffic monitoring.However,uncertain communication network among UAVs exhibits unstable links and rapid...Multiple UAVs cooperative target search has been widely used in various environments,such as emergency rescue and traffic monitoring.However,uncertain communication network among UAVs exhibits unstable links and rapid topological fluctuations due to mission complexity and unpredictable environmental states.This limitation hinders timely information sharing and insightful path decisions for UAVs,resulting in inefficient or even failed collaborative search.Aiming at this issue,this paper proposes a multi-UAV cooperative search strategy by developing a real-time trajectory decision that incorporates autonomous connectivity to reinforce multi-UAV collaboration and achieve search acceleration in uncertain search environments.Specifically,an autonomous connectivity strategy based on node cognitive information and network states is introduced to enable effective message transmission and adapt to the dynamic network environment.Based on the fused information,we formalize the trajectory planning as a multiobjective optimization problem by jointly considering search performance and UAV energy harnessing.A multi-agent deep reinforcement learning based algorithm is proposed to solve it,where the reward-guided real-time path is determined to achieve an energyefficient search.Finally,extensive experimental results show that the proposed algorithm outperforms existing works in terms of average search rate and coverage rate with reduced energy consumption under uncertain search environments.展开更多
To extract and display the significant information of combat systems,this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named“functional cartogra...To extract and display the significant information of combat systems,this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named“functional cartography of heterogeneous combat networks based on the operational chain”(FCBOC).In this framework,a functional module detection algorithm named operational chain-based label propagation algorithm(OCLPA),which considers the cooperation and interactions among combat entities and can thus naturally tackle network heterogeneity,is proposed to identify the functional modules of the network.Then,the nodes and their modules are classified into different roles according to their properties.A case study shows that FCBOC can provide a simplified description of disorderly information of combat networks and enable us to identify their functional and structural network characteristics.The results provide useful information to help commanders make precise and accurate decisions regarding the protection,disintegration or optimization of combat networks.Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.展开更多
Beyond-visual-range(BVR)air combat threat assessment has attracted wide attention as the support of situation awareness and autonomous decision-making.However,the traditional threat assessment method is flawed in its ...Beyond-visual-range(BVR)air combat threat assessment has attracted wide attention as the support of situation awareness and autonomous decision-making.However,the traditional threat assessment method is flawed in its failure to consider the intention and event of the target,resulting in inaccurate assessment results.In view of this,an integrated threat assessment method is proposed to address the existing problems,such as overly subjective determination of index weight and imbalance of situation.The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention,event,situation,and capability.On this basis,a distributed weight-solving algorithm is proposed to determine index and attribute weight respectively.Then,variable weight and game theory are introduced to effectively deal with the situation imbalance and achieve the combination of subjective and objective.The performance of the model and algorithm is evaluated through multiple simulation experiments.The assessment results demonstrate the accuracy of the proposed method in BVR air combat,indicating its potential practical significance in real air combat scenarios.展开更多
Unmanned Aerial Vehicle(UAV) trajectory prediction is an important research topic in the field of UAV air combat. In order to address the problem of single-feature extraction scale and scene adaptability in UAV air co...Unmanned Aerial Vehicle(UAV) trajectory prediction is an important research topic in the field of UAV air combat. In order to address the problem of single-feature extraction scale and scene adaptability in UAV air combat trajectory prediction algorithms, this paper proposes an innovative UAV trajectory prediction method QCNet-3D, which can predict the future trajectory of the target UAV and provide the corresponding possibility. Firstly, the UAV trajectory prediction is modeled based on the mixture of Laplace distributions, and the UAV's kinetic equations are employed to construct the UAV trajectory prediction dataset(UAVTP dataset), ensuring high reliability. Secondly, two improvement methods are proposed on the basis of QCNet: multi-scale Fourier mapping and three-dimensional adaptation. The ablation study shows that the improvement methods have reduced the minimum average displacement error, minimum final displacement error, and missing rate by 55.4%, 54.3%, and 68.1% respectively. Finally, QCNet-3D is proposed based on the two improvement methods, and the simulation experiment confirm the proposed algorithm's capability to predict both simple and complex UAV maneuvers, offering the possibility for each predicted trajectory under various prediction future steps and output modes.展开更多
Combining the heuristic algorithm (HA) developed based on the specific knowledge of the cooperative multiple target attack (CMTA) tactics and the particle swarm optimization (PSO), a heuristic particle swarm opt...Combining the heuristic algorithm (HA) developed based on the specific knowledge of the cooperative multiple target attack (CMTA) tactics and the particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. HA facilitates to search the local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. Combining the advantages of HA and PSO, HPSO algorithms can find out the global optimum quickly and efficiently. It obtains the DM solution by seeking for the optimal assignment of missiles of friendly fighter aircrafts (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA based algorithms in searching for the best solution to the DM problem.展开更多
The combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for the reduction of aircraft susce...The combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for the reduction of aircraft susceptibility. A tactical scenario for a strike mission is presented. The effect of aircraft radar cross section on the detection probability of a threat radar, as well as that of onboard jammer, are investigated. The guidance errors of radar guided surface to air missile and anti aircraft artillery, which are disturbed by radar cross section reduction or jammer radiated power and both of them are determined. The probability of aircraft kill given a single shot is calculated and finally the sortie survivability of an attack aircraft in a supposed hostile thread environment worked out. It is demonstrated that the survivability of a combat aircraft will be greatly enhanced by the combined radar stealth and onboard electronic attack, and the evaluation metho dology is effective and applicable.展开更多
基金supported by the National Natural Science Foundation of China(Grant No.61933010 and 61903301)Shaanxi Aerospace Flight Vehicle Design Key Laboratory。
文摘Cooperative autonomous air combat of multiple unmanned aerial vehicles(UAVs)is one of the main combat modes in future air warfare,which becomes even more complicated with highly changeable situation and uncertain information of the opponents.As such,this paper presents a cooperative decision-making method based on incomplete information dynamic game to generate maneuver strategies for multiple UAVs in air combat.Firstly,a cooperative situation assessment model is presented to measure the overall combat situation.Secondly,an incomplete information dynamic game model is proposed to model the dynamic process of air combat,and a dynamic Bayesian network is designed to infer the tactical intention of the opponent.Then a reinforcement learning framework based on multiagent deep deterministic policy gradient is established to obtain the perfect Bayes-Nash equilibrium solution of the air combat game model.Finally,a series of simulations are conducted to verify the effectiveness of the proposed method,and the simulation results show effective synergies and cooperative tactics.
基金supported by the Committee of Science of the Ministry of Education and Science of the Republic of Kazakhstan under Grant No.249015/0224.
文摘Unmanned aerial vehicles(UAVs)are widely used in situations with uncertain and risky areas lacking network coverage.In natural disasters,timely delivery of first aid supplies is crucial.Current UAVs face risks such as crashing into birds or unexpected structures.Airdrop systems with parachutes risk dispersing payloads away from target locations.The objective here is to use multiple UAVs to distribute payloads cooperatively to assigned locations.The civil defense department must balance coverage,accurate landing,and flight safety while considering battery power and capability.Deep Q-network(DQN)models are commonly used in multi-UAV path planning to effectively represent the surroundings and action spaces.Earlier strategies focused on advanced DQNs for UAV path planning in different configurations,but rarely addressed non-cooperative scenarios and disaster environments.This paper introduces a new DQN framework to tackle challenges in disaster environments.It considers unforeseen structures and birds that could cause UAV crashes and assumes urgent landing zones and winch-based airdrop systems for precise delivery and return.A new DQN model is developed,which incorporates the battery life,safe flying distance between UAVs,and remaining delivery points to encode surrounding hazards into the state space and Q-networks.Additionally,a unique reward system is created to improve UAV action sequences for better delivery coverage and safe landings.The experimental results demonstrate that multi-UAV first aid delivery in disaster environments can achieve advanced performance.
基金supported by the National Key R&D Program of China(No.2021YFB3300602)。
文摘Within-Visual-Range(WVR)air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making.Traditional approaches,including rule-based methods and conventional Reinforcement Learning(RL)algorithms,often focus on maximizing engagement outcomes through direct combat superiority.However,these methods overlook alternative tactics,such as inducing adversaries to crash,which can achieve decisive victories with lower risk and cost.This study proposes Alpha Crash,a novel distributional-rein forcement-learning-based agent specifically designed to defeat opponents by leveraging crash induction strategies.The approach integrates an improved QR-DQN framework to address uncertainties and adversarial tactics,incorporating advanced pilot experience into its reward functions.Extensive simulations reveal Alpha Crash's robust performance,achieving a 91.2%win rate across diverse scenarios by effectively guiding opponents into critical errors.Visualization and altitude analyses illustrate the agent's three-stage crash induction strategies that exploit adversaries'vulnerabilities.These findings underscore Alpha Crash's potential to enhance autonomous decision-making and strategic innovation in real-world air combat applications.
文摘Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning(MARL)in multiple Unmanned Combat Aerial Vehicle(UCAV)air combat scenarios.In view of this,this paper proposes a novel Dominant and Non-dominant strategy sample selection(DoNot)mechanism and a Local Observation Enhanced Multi-Agent Proximal Policy Optimization(LOE-MAPPO)algorithm to train the multi-UCAV air combat policy and improve its generalization.Specifically,the LOE-MAPPO algorithm adopts a mixed state that concatenates the global state and individual agent's local observation to enable efficient value function learning in multi-UCAV air combat.The DoNot mechanism classifies opponents into dominant or non-dominant strategy opponents,and samples from easier to more challenging opponents to form an adaptive training curriculum.Empirical results demonstrate that the proposed LOE-MAPPO algorithm outperforms baseline MARL algorithms in multi-UCAV air combat scenarios,and the DoNot mechanism leads to stronger policy generalization when facing diverse opponents.The results pave the way for the fast generation of cooperative strategies for air combat agents with MARLalgorithms.
基金co-supported by the National Natural Science Foundation of China(Nos.92371201 and 52192633)the Natural Science Foundation of Shaanxi Province of China(No.2022JC-03)the Aeronautical Science Foundation of China(No.ASFC-20220019070002)。
文摘In multiple Unmanned Aerial Vehicles(UAV)systems,achieving efficient navigation is essential for executing complex tasks and enhancing autonomy.Traditional navigation methods depend on predefined control strategies and trajectory planning and often perform poorly in complex environments.To improve the UAV-environment interaction efficiency,this study proposes a multi-UAV integrated navigation algorithm based on Deep Reinforcement Learning(DRL).This algorithm integrates the Inertial Navigation System(INS),Global Navigation Satellite System(GNSS),and Visual Navigation System(VNS)for comprehensive information fusion.Specifically,an improved multi-UAV integrated navigation algorithm called Information Fusion with MultiAgent Deep Deterministic Policy Gradient(IF-MADDPG)was developed.This algorithm enables UAVs to learn collaboratively and optimize their flight trajectories in real time.Through simulations and experiments,test scenarios in GNSS-denied environments were constructed to evaluate the effectiveness of the algorithm.The experimental results demonstrate that the IF-MADDPG algorithm significantly enhances the collaborative navigation capabilities of multiple UAVs in formation maintenance and GNSS-denied environments.Additionally,it has advantages in terms of mission completion time.This study provides a novel approach for efficient collaboration in multi-UAV systems,which significantly improves the robustness and adaptability of navigation systems.
基金supported by the National Natural Science Foundation of China(7200120972231011+2 种基金72071206)the Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province(2020RC4046)the Science Foundation for Outstanding Youth Scholars of Hunan Province(2022JJ20047).
文摘The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and emerged through complex interactions to form a vast combat system of systems(CSoS),which can be abstracted as a heterogeneous combat network(HCN).It is of great military significance to study the disintegration strategy of combat networks to achieve the breakdown of the enemy’s CSoS.To this end,this paper proposes an integrated framework called HCN disintegration based on double deep Q-learning(HCN-DDQL).Firstly,the enemy’s CSoS is abstracted as an HCN,and an evaluation index based on the capability and attack costs of nodes is proposed.Meanwhile,a mathematical optimization model for HCN disintegration is established.Secondly,the learning environment and double deep Q-network model of HCN-DDQL are established to train the HCN’s disintegration strategy.Then,based on the learned HCN-DDQL model,an algorithm for calculating the HCN’s optimal disintegration strategy under different states is proposed.Finally,a case study is used to demonstrate the reliability and effectiveness of HCNDDQL,and the results demonstrate that HCN-DDQL can disintegrate HCNs more effectively than baseline methods.
文摘During its interaction with modern sports,traditional Wushu has faced increasing doubts about its combat effectiveness,raising concerns about its cultural identity.How traditional Wushu is understood as a combat art not only helps define its cultural essence but also carries important implications for its long-term development.It is an objective fact that combat represents the practical manifestation of traditional Wushu in history.Combat reflects similarities among traditional Wushu forms that emerged throughout history.Combat reflects the historical law governing the evolution of traditional Wushu and represents an abstraction of repetitive phenomena in traditional Wushu.A correct understanding of this objectivity,these similarities,and this repeatability is conducive to promoting and carrying forward traditional Wushu,thereby facilitating an objective analysis of differences among different traditional Wushu forms and the discovery of their evolution paradigm.In the contemporary context,it is essential for traditional Wushu to emphasize its distinctive cultural roots,thereby facilitating creative transformation and innovative development.
文摘Aiming at the problem of low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments,this paper proposes a deep reinforcement learning algorithm incorporating the attention mechanism.The method is based on the Soft Actor-Critic(SAC)framework,which introduces a multi-attention mechanism in the Critic network,dynamically learns the dependency relationship between intelligences,and realizes key information screening and conflict avoidance.An environment with multiple random obstacles is designed to simulate complex emergent situations.The results show that the proposed algorithm significantly improves the mission success rate and average reward,significantly extends the survival time and exploration range of the UAVs,and verifies the effectiveness of the attention mechanism in enhancing the efficiency,robustness,and long-term planning capability of multi-UAV collaboration,as compared to the baseline method that does not use attention.
基金supported by the National Research and Development Program of China under Grant JCKY2018607C019in part by the Key Laboratory Fund of UAV of Northwestern Polytechnical University under Grant 2021JCJQLB0710L.
文摘This paper proposes a Multi-Agent Attention Proximal Policy Optimization(MA2PPO)algorithm aiming at the problems such as credit assignment,low collaboration efficiency and weak strategy generalization ability existing in the cooperative pursuit tasks of multiple unmanned aerial vehicles(UAVs).Traditional algorithms often fail to effectively identify critical cooperative relationships in such tasks,leading to low capture efficiency and a significant decline in performance when the scale expands.To tackle these issues,based on the proximal policy optimization(PPO)algorithm,MA2PPO adopts the centralized training with decentralized execution(CTDE)framework and introduces a dynamic decoupling mechanism,that is,sharing the multi-head attention(MHA)mechanism for critics during centralized training to solve the credit assignment problem.This method enables the pursuers to identify highly correlated interactions with their teammates,effectively eliminate irrelevant and weakly relevant interactions,and decompose large-scale cooperation problems into decoupled sub-problems,thereby enhancing the collaborative efficiency and policy stability among multiple agents.Furthermore,a reward function has been devised to facilitate the pursuers to encircle the escapee by combining a formation reward with a distance reward,which incentivizes UAVs to develop sophisticated cooperative pursuit strategies.Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs.Moreover,experiments on scalability have demonstrated that the algorithm is suitable for large-scale multi-UAV systems.
基金funded by the National Defense Science and Technology Innovation project,grant number ZZKY20223103the Basic Frontier InnovationProject at the Engineering University of PAP,grant number WJY202429+2 种基金the Basic Frontier lnnovation Project at the Engineering University of PAP,grant number WJY202408the Graduate Student Funding Priority Project,grant number JYWJ2024B006Key project of National Social Science Foundation,grant number 2023-SKJJ-A-116.
文摘This study introduces a novel algorithm known as the dung beetle optimization algorithm based on bounded reflection optimization andmulti-strategy fusion(BFDBO),which is designed to tackle the complexities associated with multi-UAV collaborative trajectory planning in intricate battlefield environments.Initially,a collaborative planning cost function for the multi-UAV system is formulated,thereby converting the trajectory planning challenge into an optimization problem.Building on the foundational dung beetle optimization(DBO)algorithm,BFDBO incorporates three significant innovations:a boundary reflection mechanism,an adaptive mixed exploration strategy,and a dynamic multi-scale mutation strategy.These enhancements are intended to optimize the equilibrium between local exploration and global exploitation,facilitating the discovery of globally optimal trajectories thatminimize the cost function.Numerical simulations utilizing the CEC2022 benchmark function indicate that all three enhancements of BFDBOpositively influence its performance,resulting in accelerated convergence and improved optimization accuracy relative to leading optimization algorithms.In two battlefield scenarios of varying complexities,BFDBO achieved a minimum of a 39% reduction in total trajectory planning costs when compared to DBO and three other highperformance variants,while also demonstrating superior average runtime.This evidence underscores the effectiveness and applicability of BFDBO in practical,real-world contexts.
基金co-supported by the National Natural Science Foundation of China(No.91852115)。
文摘The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots,making rapid,accurate decision-making challenging.While reinforcement learning(RL)has shown promise in this domain,the existing methods often lack strategic depth and generalization in complex,high-dimensional environments.To address these limitations,this paper proposes an optimized self-play method enhanced by advancements in fighter modeling,neural network design,and algorithmic frameworks.This study employs a six-degree-of-freedom(6-DOF)F-16 fighter model based on open-source aerodynamic data,featuring airborne equipment and a realistic visual simulation platform,unlike traditional 3-DOF models.To capture temporal dynamics,Long Short-Term Memory(LSTM)layers are integrated into the neural network,complemented by delayed input stacking.The RL environment incorporates expert strategies,curiositydriven rewards,and curriculum learning to improve adaptability and strategic decision-making.Experimental results demonstrate that the proposed approach achieves a winning rate exceeding90%against classical single-agent methods.Additionally,through enhanced 3D visual platforms,we conducted human-agent confrontation experiments,where the agent attained an average winning rate of over 75%.The agent's maneuver trajectories closely align with human pilot strategies,showcasing its potential in decision-making and pilot training applications.This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
基金supported by National Natural Science Foundation of China(No.62202449 and No.62472410)National Key Research and Development Program of China(2021YFB2900102)。
文摘Multiple UAVs cooperative target search has been widely used in various environments,such as emergency rescue and traffic monitoring.However,uncertain communication network among UAVs exhibits unstable links and rapid topological fluctuations due to mission complexity and unpredictable environmental states.This limitation hinders timely information sharing and insightful path decisions for UAVs,resulting in inefficient or even failed collaborative search.Aiming at this issue,this paper proposes a multi-UAV cooperative search strategy by developing a real-time trajectory decision that incorporates autonomous connectivity to reinforce multi-UAV collaboration and achieve search acceleration in uncertain search environments.Specifically,an autonomous connectivity strategy based on node cognitive information and network states is introduced to enable effective message transmission and adapt to the dynamic network environment.Based on the fused information,we formalize the trajectory planning as a multiobjective optimization problem by jointly considering search performance and UAV energy harnessing.A multi-agent deep reinforcement learning based algorithm is proposed to solve it,where the reward-guided real-time path is determined to achieve an energyefficient search.Finally,extensive experimental results show that the proposed algorithm outperforms existing works in terms of average search rate and coverage rate with reduced energy consumption under uncertain search environments.
文摘To extract and display the significant information of combat systems,this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named“functional cartography of heterogeneous combat networks based on the operational chain”(FCBOC).In this framework,a functional module detection algorithm named operational chain-based label propagation algorithm(OCLPA),which considers the cooperation and interactions among combat entities and can thus naturally tackle network heterogeneity,is proposed to identify the functional modules of the network.Then,the nodes and their modules are classified into different roles according to their properties.A case study shows that FCBOC can provide a simplified description of disorderly information of combat networks and enable us to identify their functional and structural network characteristics.The results provide useful information to help commanders make precise and accurate decisions regarding the protection,disintegration or optimization of combat networks.Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.
基金National Natural Science Foundation of China(62006193,62103338)Aeronautical Science Foundation of China(2022Z023053001)+1 种基金Key Research and Development Program of Shaanxi Province(2024GX-YBXM-115)Fundamental Research Funds for the Central Universities(D5000230150)。
文摘Beyond-visual-range(BVR)air combat threat assessment has attracted wide attention as the support of situation awareness and autonomous decision-making.However,the traditional threat assessment method is flawed in its failure to consider the intention and event of the target,resulting in inaccurate assessment results.In view of this,an integrated threat assessment method is proposed to address the existing problems,such as overly subjective determination of index weight and imbalance of situation.The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention,event,situation,and capability.On this basis,a distributed weight-solving algorithm is proposed to determine index and attribute weight respectively.Then,variable weight and game theory are introduced to effectively deal with the situation imbalance and achieve the combination of subjective and objective.The performance of the model and algorithm is evaluated through multiple simulation experiments.The assessment results demonstrate the accuracy of the proposed method in BVR air combat,indicating its potential practical significance in real air combat scenarios.
基金National Natural Science Foundation (NSF) of China (No.61976014)the Aeronautical Science Foundation of China (2022Z071051001)。
文摘Unmanned Aerial Vehicle(UAV) trajectory prediction is an important research topic in the field of UAV air combat. In order to address the problem of single-feature extraction scale and scene adaptability in UAV air combat trajectory prediction algorithms, this paper proposes an innovative UAV trajectory prediction method QCNet-3D, which can predict the future trajectory of the target UAV and provide the corresponding possibility. Firstly, the UAV trajectory prediction is modeled based on the mixture of Laplace distributions, and the UAV's kinetic equations are employed to construct the UAV trajectory prediction dataset(UAVTP dataset), ensuring high reliability. Secondly, two improvement methods are proposed on the basis of QCNet: multi-scale Fourier mapping and three-dimensional adaptation. The ablation study shows that the improvement methods have reduced the minimum average displacement error, minimum final displacement error, and missing rate by 55.4%, 54.3%, and 68.1% respectively. Finally, QCNet-3D is proposed based on the two improvement methods, and the simulation experiment confirm the proposed algorithm's capability to predict both simple and complex UAV maneuvers, offering the possibility for each predicted trajectory under various prediction future steps and output modes.
文摘Combining the heuristic algorithm (HA) developed based on the specific knowledge of the cooperative multiple target attack (CMTA) tactics and the particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. HA facilitates to search the local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. Combining the advantages of HA and PSO, HPSO algorithms can find out the global optimum quickly and efficiently. It obtains the DM solution by seeking for the optimal assignment of missiles of friendly fighter aircrafts (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA based algorithms in searching for the best solution to the DM problem.
文摘The combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for the reduction of aircraft susceptibility. A tactical scenario for a strike mission is presented. The effect of aircraft radar cross section on the detection probability of a threat radar, as well as that of onboard jammer, are investigated. The guidance errors of radar guided surface to air missile and anti aircraft artillery, which are disturbed by radar cross section reduction or jammer radiated power and both of them are determined. The probability of aircraft kill given a single shot is calculated and finally the sortie survivability of an attack aircraft in a supposed hostile thread environment worked out. It is demonstrated that the survivability of a combat aircraft will be greatly enhanced by the combined radar stealth and onboard electronic attack, and the evaluation metho dology is effective and applicable.