Within-Visual-Range (WVR) air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making. Traditional approaches, including rule-based methods and conventional Reinforcement Learning (RL) algorithms, often focus on maximizing engagement outcomes through direct combat superiority. However, these methods overlook alternative tactics, such as inducing adversaries to crash, which can achieve decisive victories with lower risk and cost. This study proposes Alpha Crash, a novel distributional-reinforcement-learning-based agent specifically designed to defeat opponents by leveraging crash induction strategies. The approach integrates an improved QR-DQN framework to address uncertainties and adversarial tactics, incorporating advanced pilot experience into its reward functions. Extensive simulations reveal Alpha Crash's robust performance, achieving a 91.2% win rate across diverse scenarios by effectively guiding opponents into critical errors. Visualization and altitude analyses illustrate the agent's three-stage crash induction strategies that exploit adversaries' vulnerabilities. These findings underscore Alpha Crash's potential to enhance autonomous decision-making and strategic innovation in real-world air combat applications.
Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning (MARL) in multiple Unmanned Combat Aerial Vehicle (UCAV) air combat scenarios. In view of this, this paper proposes a novel Dominant and Non-dominant strategy sample selection (DoNot) mechanism and a Local Observation Enhanced Multi-Agent Proximal Policy Optimization (LOE-MAPPO) algorithm to train the multi-UCAV air combat policy and improve its generalization. Specifically, the LOE-MAPPO algorithm adopts a mixed state that concatenates the global state and each agent's local observation to enable efficient value function learning in multi-UCAV air combat. The DoNot mechanism classifies opponents into dominant or non-dominant strategy opponents, and samples from easier to more challenging opponents to form an adaptive training curriculum. Empirical results demonstrate that the proposed LOE-MAPPO algorithm outperforms baseline MARL algorithms in multi-UCAV air combat scenarios, and the DoNot mechanism leads to stronger policy generalization when facing diverse opponents. The results pave the way for the fast generation of cooperative strategies for air combat agents with MARL algorithms.
The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and, through complex interactions, to form a vast combat system of systems (CSoS), which can be abstracted as a heterogeneous combat network (HCN). It is of great military significance to study the disintegration strategy of combat networks to achieve the breakdown of the enemy's CSoS. To this end, this paper proposes an integrated framework called HCN disintegration based on double deep Q-learning (HCN-DDQL). Firstly, the enemy's CSoS is abstracted as an HCN, and an evaluation index based on the capability and attack costs of nodes is proposed. Meanwhile, a mathematical optimization model for HCN disintegration is established. Secondly, the learning environment and double deep Q-network model of HCN-DDQL are established to train the HCN's disintegration strategy. Then, based on the learned HCN-DDQL model, an algorithm for calculating the HCN's optimal disintegration strategy under different states is proposed. Finally, a case study is used to demonstrate the reliability and effectiveness of HCN-DDQL, and the results demonstrate that HCN-DDQL can disintegrate HCNs more effectively than baseline methods.
The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots, making rapid, accurate decision-making challenging. While reinforcement learning (RL) has shown promise in this domain, the existing methods often lack strategic depth and generalization in complex, high-dimensional environments. To address these limitations, this paper proposes an optimized self-play method enhanced by advancements in fighter modeling, neural network design, and algorithmic frameworks. This study employs a six-degree-of-freedom (6-DOF) F-16 fighter model based on open-source aerodynamic data, featuring airborne equipment and a realistic visual simulation platform, unlike traditional 3-DOF models. To capture temporal dynamics, Long Short-Term Memory (LSTM) layers are integrated into the neural network, complemented by delayed input stacking. The RL environment incorporates expert strategies, curiosity-driven rewards, and curriculum learning to improve adaptability and strategic decision-making. Experimental results demonstrate that the proposed approach achieves a winning rate exceeding 90% against classical single-agent methods. Additionally, through enhanced 3D visual platforms, we conducted human-agent confrontation experiments, where the agent attained an average winning rate of over 75%. The agent's maneuver trajectories closely align with human pilot strategies, showcasing its potential in decision-making and pilot training applications. This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
During its interaction with modern sports, traditional Wushu has faced increasing doubts about its combat effectiveness, raising concerns about its cultural identity. How traditional Wushu is understood as a combat art not only helps define its cultural essence but also carries important implications for its long-term development. It is an objective fact that combat represents the practical manifestation of traditional Wushu in history. Combat reflects similarities among traditional Wushu forms that emerged throughout history. Combat reflects the historical law governing the evolution of traditional Wushu and represents an abstraction of repetitive phenomena in traditional Wushu. A correct understanding of this objectivity, these similarities, and this repeatability is conducive to promoting and carrying forward traditional Wushu, thereby facilitating an objective analysis of differences among different traditional Wushu forms and the discovery of their evolution paradigm. In the contemporary context, it is essential for traditional Wushu to emphasize its distinctive cultural roots, thereby facilitating creative transformation and innovative development.
To extract and display the significant information of combat systems, this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named "functional cartography of heterogeneous combat networks based on the operational chain" (FCBOC). In this framework, a functional module detection algorithm named operational chain-based label propagation algorithm (OCLPA), which considers the cooperation and interactions among combat entities and can thus naturally tackle network heterogeneity, is proposed to identify the functional modules of the network. Then, the nodes and their modules are classified into different roles according to their properties. A case study shows that FCBOC can provide a simplified description of disorderly information of combat networks and enable us to identify their functional and structural network characteristics. The results provide useful information to help commanders make precise and accurate decisions regarding the protection, disintegration, or optimization of combat networks. Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.
Beyond-visual-range (BVR) air combat threat assessment has attracted wide attention as a support for situation awareness and autonomous decision-making. However, the traditional threat assessment method fails to consider the intention and events of the target, resulting in inaccurate assessment results. In view of this, an integrated threat assessment method is proposed to address existing problems such as overly subjective determination of index weights and situation imbalance. The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention, event, situation, and capability. On this basis, a distributed weight-solving algorithm is proposed to determine the index weights and attribute weights separately. Then, variable weights and game theory are introduced to deal with the situation imbalance and to combine subjective and objective weighting. The performance of the model and algorithm is evaluated through multiple simulation experiments. The assessment results demonstrate the accuracy of the proposed method in BVR air combat, indicating its potential practical significance in real air combat scenarios.
Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formations are expected to show their strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for UCAV formation in BVR air combat, where confrontation is complicated and the reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the updating process of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with some state-of-the-art MARL algorithms, AHMAPPO can obtain a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. AHMAPPO can significantly increase the convergence efficiency of the strategy for UCAV formation in BVR air combat, with a maximum increase of 81.5% relative to other algorithms.
Reinforcement learning has been applied to air combat problems in recent years, and the idea of curriculum learning is often used with reinforcement learning, but traditional curriculum learning suffers from the problem of plasticity loss in neural networks. Plasticity loss is the difficulty of learning new knowledge after the network has converged. To this end, we propose a motivational curriculum learning distributed proximal policy optimization (MCLDPPO) algorithm, through which trained agents can significantly outperform the predictive game tree and mainstream reinforcement learning methods. The motivational curriculum learning is designed to help the agent gradually improve its combat ability by observing the agent's unsatisfactory performance and providing appropriate rewards as a guide. Furthermore, complete tactical maneuvers are encapsulated based on existing air combat knowledge, and through the flexible use of these maneuvers, some tactics beyond human knowledge can be realized. In addition, we designed an interruption mechanism that increases the agent's decision-making frequency when it faces an emergency. When the number of threats received by the agent changes, the current action is interrupted in order to reacquire observations and make decisions again. Using the interruption mechanism can significantly improve the performance of the agent. To simulate actual air combat better, we use digital twin technology to simulate real air battles and propose a parallel battlefield mechanism that can run multiple simulation environments simultaneously, effectively improving data throughput. The experimental results demonstrate that the agent can fully utilize the situational information to make reasonable decisions and adapt its tactics in air combat, verifying the effectiveness of the algorithmic framework proposed in this paper.
Today's air combat has reached a high level of uncertainty where continuous or discrete variables with crisp values cannot be properly represented using fuzzy sets. With a set of membership functions, fuzzy logic is well-suited to tackle such complex states and actions. However, it is not necessary to fuzzify the variables that have definite discrete semantics. Hence, the aim of this study is to improve the level of model abstraction by proposing multiple levels of cascaded hierarchical structures from the perspective of function, namely, the functional decision tree. This method is developed to represent behavioral modeling of air combat systems, and its metamodel, execution mechanism, and code generation can provide a sound basis for function-based behavioral modeling. As a proof of concept, an air combat simulation is developed to validate this method, and the results show that the fighter Alpha built using the proposed framework provides better performance than that using default scripts.
With continuous growth in scale, topology complexity, mission phases, and mission diversity, efficient capability evaluation of modern combat systems has become increasingly challenging. Aiming at the problems of insufficient mission consideration and a single evaluation dimension in existing evaluation approaches, this study proposes a mission-oriented capability evaluation method for combat systems based on the operation loop. Firstly, a combat network model is given that takes into account the capability properties of combat nodes. Then, based on the transition matrix between combat nodes, an efficient algorithm for operation loop identification is proposed based on Breadth-First Search. Given the mission-capability satisfaction of nodes, effectiveness evaluation indexes for operation loops and the combat network are proposed, followed by a node importance measure. Through a case study of a combat scenario involving space-based support against surface ships under different strategies, the effectiveness of the proposed method is verified. The results indicate that the ROI-priority attack method has a notable impact on reducing the overall efficiency of the network, whereas the O-L betweenness-priority attack is more effective in obstructing the successful execution of enemy attack missions.
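The abstract above does not spell out its loop-identification algorithm, so the following is only a minimal sketch of the general idea: a breadth-first search over a 0/1 transition matrix that collects simple directed cycles returning to a chosen start node. The function name `find_operation_loops`, the `max_nodes` cap, and the four-node example network are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def find_operation_loops(adjacency, start, max_nodes=4):
    """Enumerate simple directed cycles through `start` with BFS.

    `adjacency[i][j]` is truthy when node i can pass to node j.
    `max_nodes` (a hypothetical cap) bounds the distinct nodes per loop.
    """
    loops = []
    queue = deque([[start]])  # each queue entry is a partial path
    while queue:
        path = queue.popleft()
        node = path[-1]
        for nxt, connected in enumerate(adjacency[node]):
            if not connected:
                continue
            if nxt == start and len(path) > 1:
                # Path closes back on the start node: record the loop.
                loops.append(path + [start])
            elif nxt not in path and len(path) < max_nodes:
                # Extend the path with an unvisited node.
                queue.append(path + [nxt])
    return loops


# Hypothetical network: 0 = target, 1 = sensor, 2 = decider, 3 = influencer.
example = [
    [0, 1, 0, 0],  # target is observed by the sensor
    [0, 0, 1, 1],  # sensor reports to the decider or cues the influencer
    [0, 0, 0, 1],  # decider commands the influencer
    [1, 0, 0, 0],  # influencer engages the target
]
print(find_operation_loops(example, start=0))
```

Because BFS expands shorter paths first, shorter loops are reported before longer ones, which is convenient when loop length is used as a cost proxy.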
In the air combat process, the confrontation position is the critical factor determining the confrontation situation, attack effect, and escape probability of UAVs. Therefore, selecting the optimal confrontation position becomes the primary goal of maneuver decision-making. By taking the position as the UAV's maneuver strategy, this paper constructs the optimal confrontation position selecting games (OCPSGs) model. In the OCPSGs model, the payoff function of each UAV is defined by the difference between the comprehensive advantages of both sides, and the strategy space of each UAV at every step is defined by its accessible space determined by its maneuverability. Then we design the limit approximation of mixed strategy Nash equilibrium (LAMSNQ) algorithm, which provides a method to determine the optimal probability distribution of positions in the strategy space. In the simulation phase, we assume that the motions in the three directions are independent and that the strategy space is a cuboid to simplify the model. Several simulations are performed to verify the feasibility, effectiveness, and stability of the algorithm.
Future unmanned battles desperately require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, due to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions would not change the Nash Equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS) that comprises maneuver shaping advice and threat assessment-based attack shaping advice. Then, we investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance compared with that of the baseline strategy.
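The abstract above does not give the form of its shaping functions. As a hedged illustration only, the sketch below assumes the standard potential-based form, in which each shaping term is gamma * phi(s') - phi(s), and sums several such terms. The two potentials (`maneuver_potential`, `threat_potential`) and the state fields `distance` and `threat` are hypothetical stand-ins, not the paper's TRS definitions.

```python
def shaped_reward(env_reward, state, next_state, potentials, gamma=0.99):
    """Add several potential-based shaping terms to the environment reward.

    Each term has the form gamma * phi(next_state) - phi(state).
    Assumption: the shaping advice is potential-based; the abstract
    does not confirm the exact form used in the paper.
    """
    shaping = sum(gamma * phi(next_state) - phi(state) for phi in potentials)
    return env_reward + shaping


def maneuver_potential(state):
    # Hypothetical: being closer to the target is better.
    return -state["distance"]


def threat_potential(state):
    # Hypothetical: lower assessed threat is better.
    return -state["threat"]


s = {"distance": 10.0, "threat": 0.5}
s_next = {"distance": 8.0, "threat": 0.25}
r = shaped_reward(1.0, s, s_next, [maneuver_potential, threat_potential], gamma=1.0)
print(r)  # 3.25: 1.0 from the environment, 2.0 maneuver term, 0.25 threat term
```

With this form, the shaping terms telescope along a trajectory, which is the usual intuition for why summing several of them leaves the ranking of policies unchanged.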
Lying in her makeshift hospital bed, Joyce Tembo thanked medical personnel for evacuating her to the designated national cholera treatment centre, 6 km north of Zambia's capital Lusaka. She was recently diagnosed with diarrhoeal disease. Tembo, 43, commended the medical staff stationed at the treatment centre for their great service to thousands of patients, especially women and children seeking urgent treatment. "I am very grateful to the Chinese doctors who attended to me as soon as the ambulance rushed me to the clinic where I received urgent treatment; they have really saved my life," Tembo told ChinAfrica. But not all residents in her community are as lucky as her. Many in the densely populated slums die every day due to the area's poor sanitation, one of the major causes of the cholera outbreak.
Since free combat is a competitive sport that flexibly uses kicking, punching, wrestling, and holding techniques to defeat the opponent, good core strength can help athletes improve their technical level, enhance the quality of their movements, and protect their joints and muscles. To carry out high-quality core strength training in free combat teaching, coaches first need to conduct simple training, centralized training, and extended training according to the basic plan of adaptation, stabilization, and improvement. Secondly, it is also important to test the athlete's physical and athletic qualities before implementing the specific training plan, optimize the training program, and carry out statistical analysis of the stage training data in order to achieve the best training effect.
To solve the problem of realizing autonomous aerial combat decision-making for unmanned combat aerial vehicles (UCAVs) rapidly and accurately in an uncertain environment, this paper proposes a decision-making method based on an improved deep reinforcement learning (DRL) algorithm: the multistep double deep Q-network (MS-DDQN) algorithm. First, a six-degree-of-freedom UCAV model based on an aircraft control system is established on a simulation platform, and the situation assessment functions of the UCAV and its target are established by considering their angles, altitudes, environments, missile attack performances, and UCAV performance. By controlling the flight path angle, roll angle, and flight velocity, 27 common basic actions are designed. On this basis, aiming to overcome the defects of traditional DRL in terms of training speed and convergence speed, the improved MS-DDQN method is introduced to incorporate the final return value into the previous steps. Finally, the pre-training learning model is used as the starting point for the second learning model to simulate the UCAV aerial combat decision-making process based on the basic training method, which helps to shorten the training time and improve the learning efficiency. The improved DRL algorithm significantly accelerates the training speed and estimates the target value more accurately during training, and it can be applied to aerial combat decision-making.
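The abstract above does not state the MS-DDQN update rule. The sketch below shows the standard n-step double-DQN target under that caveat: the discounted sum of the next n rewards, plus a bootstrapped value in which the online network selects the action and the target network evaluates it. The function name and argument names are hypothetical.

```python
def multistep_double_dqn_target(rewards, q_online_next, q_target_next,
                                gamma=0.99, done=False):
    """Compute an n-step double-DQN target (generic form, assumed here).

    rewards       : the n rewards collected after the current state
    q_online_next : online-network Q-values at the state n steps ahead
    q_target_next : target-network Q-values at that same state
    done          : True if the episode ended within the n steps
    """
    # Discounted n-step return propagates later rewards back to this step.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        return n_step_return
    # Double DQN: select the action with the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and evaluate it with the target network.
    bootstrap = q_target_next[best_action]
    return n_step_return + (gamma ** len(rewards)) * bootstrap


target = multistep_double_dqn_target(
    rewards=[1.0, 0.0, 2.0],
    q_online_next=[0.1, 0.9, 0.3],
    q_target_next=[5.0, 4.0, 8.0],
    gamma=0.5,
)
print(target)  # 2.0: n-step return 1.5 plus 0.125 * 4.0
```

Note that the target network's own maximum (8.0) is not used; decoupling selection from evaluation is what reduces the overestimation bias of plain Q-learning targets.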
By combining a heuristic algorithm (HA), developed from specific knowledge of cooperative multiple target attack (CMTA) tactics, with particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. HA facilitates the search for a local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. Combining the advantages of HA and PSO, the HPSO algorithm can find the global optimum quickly and efficiently. It obtains the DM solution by seeking the optimal assignment of the missiles of friendly fighter aircraft (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA-based algorithms in searching for the best solution to the DM problem.
In evaluating the combat effectiveness of a defense system, the target's probability of penetrating the defended area is a primary index of concern. In this paper, a stochastic model to compute the probability that a target penetrates the defended area along any flight path is established through the state analysis and statistical equilibrium analysis of stochastic service system theory. The simulated annealing algorithm is a heuristic random search method based on Monte Carlo iteration, and it can find the global optimal solution by simulating the annealing process. Combining the stochastic model for computing the probability with the simulated annealing algorithm, this paper establishes a method to solve the problem of combat configuration optimization of weapon systems quantitatively. The calculated results show that a near-ideal configuration of the weapon's fire cells is found quickly with this method, and that this quantitative method for combat configuration is faster and more scientific than the previous one based on map-based fire field principles.
Combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for reducing aircraft susceptibility. A tactical scenario for a strike mission is presented. The effects of the aircraft radar cross section and of the onboard jammer on the detection probability of a threat radar are investigated. The guidance errors of radar-guided surface-to-air missiles and anti-aircraft artillery, as degraded by radar cross section reduction, jammer radiated power, or both, are determined. The probability of aircraft kill given a single shot is calculated, and finally the sortie survivability of an attack aircraft in a supposed hostile threat environment is worked out. It is demonstrated that the survivability of a combat aircraft will be greatly enhanced by combined radar stealth and onboard electronic attack, and that the evaluation methodology is effective and applicable.
A method is proposed to resolve the typical problem of air combat situation assessment. Taking the one-to-one air combat as an example and on the basis of air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is equivalent to the situation classification problem of air combat data. The fuzzy C-means clustering algorithm is proposed to cluster the selected air combat sample data and the situation classification of the data is determined by the data correlation analysis in combination with the clustering results and the pilots' description of the air combat process. On the basis of semi-supervised naive Bayes classifier, an improved algorithm is proposed based on data classification confidence, through which the situation classification of air combat data is carried out. The simulation results show that the improved algorithm can assess the air combat situation effectively and the improvement of the algorithm can promote the classification performance without significantly affecting the efficiency of the classifier.
Funding (Alpha Crash WVR air combat study): supported by the National Key R&D Program of China (No. 2021YFB3300602).
Funding (HCN-DDQL combat network disintegration study): supported by the National Natural Science Foundation of China (72001209, 72231011, 72071206), the Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province (2020RC4046), and the Science Foundation for Outstanding Youth Scholars of Hunan Province (2022JJ20047).
Funding: Co-supported by the National Natural Science Foundation of China (No. 91852115).
Abstract: The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots, making rapid, accurate decision-making challenging. While reinforcement learning (RL) has shown promise in this domain, the existing methods often lack strategic depth and generalization in complex, high-dimensional environments. To address these limitations, this paper proposes an optimized self-play method enhanced by advancements in fighter modeling, neural network design, and algorithmic frameworks. Unlike traditional 3-DOF models, this study employs a six-degree-of-freedom (6-DOF) F-16 fighter model based on open-source aerodynamic data, featuring airborne equipment and a realistic visual simulation platform. To capture temporal dynamics, Long Short-Term Memory (LSTM) layers are integrated into the neural network, complemented by delayed input stacking. The RL environment incorporates expert strategies, curiosity-driven rewards, and curriculum learning to improve adaptability and strategic decision-making. Experimental results demonstrate that the proposed approach achieves a winning rate exceeding 90% against classical single-agent methods. Additionally, through enhanced 3D visual platforms, we conducted human-agent confrontation experiments, where the agent attained an average winning rate of over 75%. The agent's maneuver trajectories closely align with human pilot strategies, showcasing its potential in decision-making and pilot training applications. This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
Abstract: During its interaction with modern sports, traditional Wushu has faced increasing doubts about its combat effectiveness, raising concerns about its cultural identity. How traditional Wushu is understood as a combat art not only helps define its cultural essence but also carries important implications for its long-term development. It is an objective fact that combat represents the practical manifestation of traditional Wushu in history. Combat reflects similarities among traditional Wushu forms that emerged throughout history. Combat reflects the historical law governing the evolution of traditional Wushu and represents an abstraction of repetitive phenomena in traditional Wushu. A correct understanding of this objectivity, these similarities, and this repeatability is conducive to promoting and carrying forward traditional Wushu, thereby facilitating an objective analysis of differences among different traditional Wushu forms and the discovery of their evolution paradigm. In the contemporary context, it is essential for traditional Wushu to emphasize its distinctive cultural roots, thereby facilitating creative transformation and innovative development.
Abstract: To extract and display the significant information of combat systems, this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named "functional cartography of heterogeneous combat networks based on the operational chain" (FCBOC). In this framework, a functional module detection algorithm named operational chain-based label propagation algorithm (OCLPA), which considers the cooperation and interactions among combat entities and can thus naturally tackle network heterogeneity, is proposed to identify the functional modules of the network. Then, the nodes and their modules are classified into different roles according to their properties. A case study shows that FCBOC can provide a simplified description of disorderly information of combat networks and enable us to identify their functional and structural network characteristics. The results provide useful information to help commanders make precise and accurate decisions regarding the protection, disintegration, or optimization of combat networks. Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.
Funding: National Natural Science Foundation of China (62006193, 62103338); Aeronautical Science Foundation of China (2022Z023053001); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-115); Fundamental Research Funds for the Central Universities (D5000230150).
Abstract: Beyond-visual-range (BVR) air combat threat assessment has attracted wide attention as a support for situation awareness and autonomous decision-making. However, the traditional threat assessment method fails to consider the intention and events of the target, resulting in inaccurate assessment results. In view of this, an integrated threat assessment method is proposed to address the existing problems, such as overly subjective determination of index weights and imbalance of the situation. The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention, event, situation, and capability. On this basis, a distributed weight-solving algorithm is proposed to determine index and attribute weights respectively. Then, variable weight and game theory are introduced to effectively deal with the situation imbalance and to combine subjective and objective weighting. The performance of the model and algorithm is evaluated through multiple simulation experiments. The assessment results demonstrate the accuracy of the proposed method in BVR air combat, indicating its potential practical significance in real air combat scenarios.
Funding: Co-supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: Highly intelligent Unmanned Combat Aerial Vehicle (UCAV) formation is expected to bring out strengths in Beyond-Visual-Range (BVR) air combat. Although Multi-Agent Reinforcement Learning (MARL) shows outstanding performance in cooperative decision-making, it is challenging for existing MARL algorithms to quickly converge to an optimal strategy for UCAV formation in BVR air combat, where confrontation is complicated and reward is extremely sparse and delayed. To solve this problem, this paper proposes an Advantage Highlight Multi-Agent Proximal Policy Optimization (AHMAPPO) algorithm. First, at every step, AHMAPPO records the degree to which the best formation exceeds the average of formations in parallel environments and carries out additional advantage sampling according to it. Then, the sampling result is introduced into the updating process of the actor network to improve its optimization efficiency. Finally, the simulation results reveal that, compared with some state-of-the-art MARL algorithms, AHMAPPO can obtain a better strategy using fewer sample episodes in the UCAV formation BVR air combat simulation environment built in this paper, which reflects the critical features of BVR air combat. AHMAPPO can significantly increase the convergence efficiency of the strategy for UCAV formation in BVR air combat, with a maximum increase of 81.5% relative to other algorithms.
Abstract: Reinforcement learning has been applied to air combat problems in recent years, and the idea of curriculum learning is often used for reinforcement learning, but traditional curriculum learning suffers from the problem of plasticity loss in neural networks. Plasticity loss is the difficulty of learning new knowledge after the network has converged. To this end, we propose a motivational curriculum learning distributed proximal policy optimization (MCLDPPO) algorithm, through which trained agents can significantly outperform the predictive game tree and mainstream reinforcement learning methods. The motivational curriculum learning is designed to help the agent gradually improve its combat ability by observing the agent's unsatisfactory performance and providing appropriate rewards as a guide. Furthermore, a complete tactical maneuver is encapsulated based on the existing air combat knowledge, and through the flexible use of these maneuvers, some tactics beyond human knowledge can be realized. In addition, we designed an interruption mechanism for the agent to increase the frequency of decision-making when the agent faces an emergency. When the number of threats received by the agent changes, the current action is interrupted in order to reacquire observations and make decisions again. Using the interruption mechanism can significantly improve the performance of the agent. To simulate actual air combat better, we use digital twin technology to simulate real air battles and propose a parallel battlefield mechanism that can run multiple simulation environments simultaneously, effectively improving data throughput. The experimental results demonstrate that the agent can fully utilize the situational information to make reasonable decisions and provide tactical adaptation in air combat, verifying the effectiveness of the algorithmic framework proposed in this paper.
Funding: This work was supported by the National Natural Science Foundation of China (62003359).
Abstract: Today's air combat has reached a high level of uncertainty where continuous or discrete variables with crisp values cannot be properly represented using fuzzy sets. With a set of membership functions, fuzzy logic is well-suited to tackle such complex states and actions. However, it is not necessary to fuzzify the variables that have definite discrete semantics. Hence, the aim of this study is to improve the level of model abstraction by proposing multiple levels of cascaded hierarchical structures from the perspective of function, namely, the functional decision tree. This method is developed to represent behavioral modeling of air combat systems, and its metamodel, execution mechanism, and code generation can provide a sound basis for function-based behavioral modeling. As a proof of concept, an air combat simulation is developed to validate this method, and the results show that the fighter Alpha built using the proposed framework provides better performance than that using default scripts.
Abstract: With continuous growth in scale, topology complexity, mission phases, and mission diversity, the efficient capability evaluation of modern combat systems faces growing challenges. Aiming at the problems of insufficient mission consideration and a single evaluation dimension in the existing evaluation approaches, this study proposes a mission-oriented capability evaluation method for combat systems based on the operation loop. Firstly, a combat network model is given that takes into account the capability properties of combat nodes. Then, based on the transition matrix between combat nodes, an efficient algorithm for operation loop identification is proposed based on Breadth-First Search. Given the mission-capability satisfaction of nodes, the effectiveness evaluation indexes for operation loops and the combat network are proposed, followed by a node importance measure. Through a case study of a combat scenario involving space-based support against surface ships under different strategies, the effectiveness of the proposed method is verified. The results indicate that the ROI-priority attack method has a notable impact on reducing the overall efficiency of the network, whereas the O-L betweenness-priority attack is more effective in obstructing the successful execution of enemy attack missions.
Funding: National Key R&D Program of China (Grant No. 2021YFA1000402) and National Natural Science Foundation of China (Grant No. 72071159), which provided funds for conducting experiments.
Abstract: In the air combat process, the confrontation position is the critical factor that determines the confrontation situation, attack effect, and escape probability of UAVs. Therefore, selecting the optimal confrontation position becomes the primary goal of maneuver decision-making. By taking the position as the UAV's maneuver strategy, this paper constructs the optimal confrontation position selecting games (OCPSGs) model. In the OCPSGs model, the payoff function of each UAV is defined by the difference between the comprehensive advantages of both sides, and the strategy space of each UAV at every step is defined by its accessible space, as determined by its maneuverability. Then we design the limit approximation of mixed strategy Nash equilibrium (LAMSNQ) algorithm, which provides a method to determine the optimal probability distribution of positions in the strategy space. In the simulation phase, we assume that the motions in the three directions are independent and that the strategy space is a cuboid, to simplify the model. Several simulations are performed to verify the feasibility, effectiveness, and stability of the algorithm.
Abstract: Future unmanned battles desperately require intelligent combat policies, and multi-agent reinforcement learning offers a promising solution. However, due to the complexity of combat operations and the large size of the combat group, this task suffers from the credit assignment problem more than other reinforcement learning tasks. This study uses reward shaping to relieve the credit assignment problem and improve policy training for the new generation of large-scale unmanned combat operations. We first prove that multiple reward shaping functions would not change the Nash Equilibrium in stochastic games, providing theoretical support for their use. According to the characteristics of combat operations, we propose tactical reward shaping (TRS) that comprises maneuver shaping advice and threat assessment-based attack shaping advice. Then, we investigate the effects of different types and combinations of shaping advice on combat policies through experiments. The results show that TRS improves both the efficiency and attack accuracy of combat policies, with the combination of maneuver reward shaping advice and ally-focused attack shaping advice achieving the best performance compared with that of the baseline strategy.
Abstract: Lying in her makeshift hospital bed, Joyce Tembo thanked medical personnel for evacuating her to the designated national cholera treatment centre, 6 km north of Zambia's capital Lusaka. She was recently diagnosed with diarrhoeal disease. Tembo, 43, commended the medical staff stationed at the treatment centre for their great service to thousands of patients, especially women and children seeking urgent treatment. "I am very grateful to the Chinese doctors who attended to me as soon as the ambulance rushed me to the clinic where I received urgent treatment; they have really saved my life," Tembo told ChinAfrica. But not all residents in her community are as lucky as her. Many in the densely populated slums die every day due to the area's poor sanitation, one of the major causes of the cholera outbreak.
Abstract: Since free combat is a competitive sport that flexibly utilizes kicking, punching, wrestling, and holding techniques to defeat the opponent, good core strength can help athletes improve their technical level, enhance the quality of their movements, and protect their joints and muscles. To carry out high-quality core strength training in free combat teaching, firstly, coaches need to carry out simple training, centralized training, and extended training according to the basic plan of adaptation, stabilization, and improvement. Secondly, it is also important to test the athlete's physical and athletic qualities before implementing the specific training plan, optimize the training program, and carry out statistical analysis of the stage training data in order to achieve the best training effect.
Funding: Supported by the National Natural Science Foundation of China (No. 61573286), the Aeronautical Science Foundation of China (No. 20180753006), the Fundamental Research Funds for the Central Universities (3102019ZDHKY07), the Natural Science Foundation of Shaanxi Province (2019JM-163, 2020JQ-218), and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.
Abstract: To solve the problem of realizing autonomous aerial combat decision-making for unmanned combat aerial vehicles (UCAVs) rapidly and accurately in an uncertain environment, this paper proposes a decision-making method based on an improved deep reinforcement learning (DRL) algorithm: the multistep double deep Q-network (MS-DDQN) algorithm. First, a six-degree-of-freedom UCAV model based on an aircraft control system is established on a simulation platform, and the situation assessment functions of the UCAV and its target are established by considering their angles, altitudes, environments, missile attack performances, and UCAV performance. By controlling the flight path angle, roll angle, and flight velocity, 27 common basic actions are designed. On this basis, aiming to overcome the defects of traditional DRL in terms of training speed and convergence speed, the improved MS-DDQN method is introduced to incorporate the final return value into the previous steps. Finally, the pre-trained learning model is used as the starting point for the second learning model to simulate the UCAV aerial combat decision-making process based on the basic training method, which helps to shorten the training time and improve the learning efficiency. The improved DRL algorithm significantly accelerates training and estimates the target value more accurately during training, and it can be applied to aerial combat decision-making.
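As a rough illustration of the multistep double DQN idea named in the abstract above, the following Python sketch computes an n-step target in which the online network picks the next action and the target network scores it. It is not the authors' implementation: the function names, the three-level discretization of each control (decrease, hold, increase, giving 3 × 3 × 3 = 27 actions), and the plain-list Q-values are all assumptions made for this example.

```python
from itertools import product


def action_set():
    """Hypothetical discretization: each of the three controls (flight path
    angle, roll angle, velocity) is decreased, held, or increased."""
    return list(product((-1, 0, 1), repeat=3))


def multistep_double_dqn_target(rewards, gamma, q_online_next, q_target_next, done):
    """n-step double DQN target for one transition.

    rewards        -- the n rewards r_t, ..., r_{t+n-1}
    gamma          -- discount factor
    q_online_next  -- online-network Q-values at state s_{t+n}
    q_target_next  -- target-network Q-values at state s_{t+n}
    done           -- True if the episode ended within the n steps
    """
    # Discounted sum of the n observed rewards.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        return n_step_return
    # Double DQN: select the action with the online network,
    # evaluate it with the target network.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return n_step_return + (gamma ** len(rewards)) * q_target_next[best_action]


if __name__ == "__main__":
    print(len(action_set()))
    print(multistep_double_dqn_target([1.0, 1.0, 1.0], 0.5, [0.2, 0.9], [10.0, 4.0], False))
```

Propagating several real rewards before bootstrapping is one common way to let a late return influence earlier steps more quickly; whether MS-DDQN uses exactly this form is not stated in the abstract.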
Abstract: Combining the heuristic algorithm (HA), developed from specific knowledge of cooperative multiple target attack (CMTA) tactics, with particle swarm optimization (PSO), a heuristic particle swarm optimization (HPSO) algorithm is proposed to solve the decision-making (DM) problem. HA facilitates the search for the local optimum in the neighborhood of a solution, while the PSO algorithm tends to explore the search space for possible solutions. Combining the advantages of HA and PSO, the HPSO algorithm can find the global optimum quickly and efficiently. It obtains the DM solution by seeking the optimal assignment of the missiles of friendly fighter aircraft (FAs) to hostile FAs. Simulation results show that the proposed algorithm is superior to the general PSO algorithm and two GA-based algorithms in searching for the best solution to the DM problem.
Abstract: In evaluating the combat effectiveness of a defense system, the target's probability of penetrating the defended area is a primary index of concern. In this paper, a stochastic model to compute the probability that a target penetrates the defended area along any flight path is established by the state analysis and statistical equilibrium analysis of stochastic service system theory. The simulated annealing algorithm is a heuristic random search method based on Monte Carlo iteration, and it can find the global optimal solution by simulating the annealing process. Combining the stochastic model for computing the probability with the simulated annealing algorithm, this paper establishes a method to solve the combat configuration optimization problem of weapon systems quantitatively. The calculated results show that the best configuration for the fire cells of the weapon is found quickly using this method, and that this quantitative method for combat configuration is faster and more scientific than the previous one based on principle via map fire field.
Abstract: Combat survivability is an essential factor to be considered in the development of recent military aircraft. Radar stealth and onboard electronic attack are two major techniques for the reduction of aircraft susceptibility. A tactical scenario for a strike mission is presented. The effects of aircraft radar cross section and of an onboard jammer on the detection probability of a threat radar are investigated. The guidance errors of radar-guided surface-to-air missiles and anti-aircraft artillery, as disturbed by radar cross section reduction, jammer radiated power, or both, are determined. The probability of aircraft kill given a single shot is calculated, and finally the sortie survivability of an attack aircraft in a supposed hostile threat environment is worked out. It is demonstrated that the survivability of a combat aircraft will be greatly enhanced by the combination of radar stealth and onboard electronic attack, and that the evaluation methodology is effective and applicable.
Funding: Supported by the Aviation Science Foundation of China (20152096019).
Abstract: A method is proposed to resolve the typical problem of air combat situation assessment. Taking one-to-one air combat as an example, and on the basis of air combat data recorded by the air combat maneuvering instrument, the problem of air combat situation assessment is treated as equivalent to the situation classification problem of air combat data. The fuzzy C-means clustering algorithm is used to cluster the selected air combat sample data, and the situation classification of the data is determined by data correlation analysis in combination with the clustering results and the pilots' description of the air combat process. On the basis of the semi-supervised naive Bayes classifier, an improved algorithm is proposed based on data classification confidence, through which the situation classification of air combat data is carried out. The simulation results show that the improved algorithm can assess the air combat situation effectively, and that the improvement promotes classification performance without significantly affecting the efficiency of the classifier.
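The fuzzy C-means step mentioned in the abstract above can be illustrated with one round of the standard textbook updates: soft memberships from distances to the cluster centers, then centers as membership-weighted means. This is a minimal Python sketch of the general algorithm, not the paper's code; the function names, the fuzzifier default `m=2.0`, and the one-dimensional sample data are assumptions for illustration only.

```python
def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.
    Each row sums to 1. The fuzzifier m must be greater than 1."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share full
            # membership among those centers and give none to the rest.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Recompute each center as the mean of all points weighted by u**m.
    Assumes every cluster has some nonzero membership."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * x[d] for w, x in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


if __name__ == "__main__":
    samples = [(0.0,), (1.0,), (4.0,)]      # hypothetical 1-D feature values
    u = fcm_memberships(samples, [(0.0,), (4.0,)])
    print(u)
    print(fcm_centers(samples, u))
```

In practice the two updates are alternated until the memberships stop changing; the resulting soft labels could then feed a downstream classifier such as the semi-supervised naive Bayes step the abstract describes.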