With the rapid development of artificial intelligence, intelligent air combat maneuver decision-making (ACMD) has garnered global attention. Although deep reinforcement learning provides a promising approach to ACMD, existing methods often suffer from rigid reward functions and limited adaptability to evolving adversarial strategies. Moreover, most research assumes open airspace, overlooking the influence of potential obstacles. In this paper, we address one-on-one within-visual-range ACMD in obstructed environments and propose an improved Soft Actor-Critic (SAC) algorithm trained under a curriculum self-play framework. A maneuver strategy mirroring inference module is integrated so that each aircraft can estimate the other's likely position when visual obstruction occurs. By leveraging curriculum learning to guide progressive experience accumulation and self-play for adversarial evolution, our method enhances both training efficiency and tactical diversity. We further integrate an attention mechanism that dynamically adjusts the weights of sub-rewards, enabling the learned policy to adapt to rapidly changing air combat situations. Numerical simulations demonstrate that our enhanced SAC converges more quickly and achieves higher win rates than the baseline methods. An animation is available at bilibili.com/video/BV1BHVszHE98 for better illustration.
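The dynamic sub-reward weighting mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the attention architecture, so the code below assumes a plain softmax over relevance scores; the function names, the sub-reward labels, and the source of the scores are all hypothetical and not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_reward(sub_rewards, scores):
    """Combine sub-rewards with softmax attention weights.

    sub_rewards: dict mapping a sub-reward name to its current value.
    scores: dict mapping the same names to relevance scores
            (assumed to come from some learned scorer; hypothetical).
    Returns the scalar reward and the weight assigned to each sub-reward.
    """
    names = sorted(sub_rewards)
    weights = softmax([scores[n] for n in names])
    total = sum(w * sub_rewards[n] for w, n in zip(weights, names))
    return total, dict(zip(names, weights))
```

With equal scores the weights are uniform and the result is the plain average of the sub-rewards; raising one score shifts weight toward that sub-reward, which is the kind of situation-dependent emphasis the abstract refers to.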
Manned aerial vehicle-unmanned aerial vehicle (MAV-UAV) combat organization is a MAV-UAV combat collective formed from the perspective of organization design theory and methodology, and the generation of the force formation plan is a key step in organizational planning. Based on the description of the problem and the definition of organizational elements, a matching model of platform-target attack waves is constructed to minimize the redundancy of command and decision-making capability, resource capability, and the number of platforms used. Based on the non-dominated sorting genetic algorithm III (NSGA-III) framework, which includes an encoding/decoding method and a constraint handling method, the generation model of the organizational force formation plan is solved, and the effectiveness and superiority of the algorithm are verified by simulation experiments.
Unmanned Aerial Vehicle (UAV) trajectory prediction is an important research topic in the field of UAV air combat. To address the problems of single-scale feature extraction and limited scene adaptability in UAV air combat trajectory prediction algorithms, this paper proposes a UAV trajectory prediction method, QCNet-3D, which can predict the future trajectory of the target UAV and provide the corresponding probability. Firstly, UAV trajectory prediction is modeled based on a mixture of Laplace distributions, and the UAV's kinetic equations are employed to construct the UAV trajectory prediction dataset (UAVTP dataset), ensuring high reliability. Secondly, two improvement methods are proposed on the basis of QCNet: multi-scale Fourier mapping and three-dimensional adaptation. The ablation study shows that the improvement methods reduce the minimum average displacement error, minimum final displacement error, and miss rate by 55.4%, 54.3%, and 68.1%, respectively. Finally, QCNet-3D is proposed based on the two improvement methods, and simulation experiments confirm the proposed algorithm's capability to predict both simple and complex UAV maneuvers, offering a probability for each predicted trajectory under various numbers of future prediction steps and output modes.
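The three metrics named in this abstract can be made concrete with a short sketch. It follows the usual multi-modal forecasting convention (best of K candidate trajectories per sample); the miss threshold is an assumption, since the abstract does not state the value used, and the function names are illustrative.

```python
import math

def min_ade_fde(candidates, truth):
    """Best-of-K displacement errors for one sample.

    candidates: list of K predicted trajectories, each a list of (x, y, z).
    truth: ground-truth trajectory, a list of (x, y, z) of the same length.
    Returns (minADE, minFDE), each minimised independently over the K modes.
    """
    ades = [
        sum(math.dist(p, q) for p, q in zip(traj, truth)) / len(truth)
        for traj in candidates
    ]
    fdes = [math.dist(traj[-1], truth[-1]) for traj in candidates]
    return min(ades), min(fdes)

def miss_rate(samples, threshold):
    """Fraction of samples whose minFDE exceeds `threshold`.

    samples: list of (candidates, truth) pairs.
    threshold: miss distance in the trajectory's length unit (assumed value).
    """
    misses = sum(
        1 for candidates, truth in samples
        if min_ade_fde(candidates, truth)[1] > threshold
    )
    return misses / len(samples)
```

Lower values are better for all three metrics, so the reported reductions of 55.4%, 54.3%, and 68.1% are improvements.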
Within-Visual-Range (WVR) air combat is a highly dynamic and uncertain domain where effective strategies require intelligent and adaptive decision-making. Traditional approaches, including rule-based methods and conventional Reinforcement Learning (RL) algorithms, often focus on maximizing engagement outcomes through direct combat superiority. However, these methods overlook alternative tactics, such as inducing adversaries to crash, which can achieve decisive victories with lower risk and cost. This study proposes Alpha Crash, a novel distributional-reinforcement-learning-based agent specifically designed to defeat opponents by leveraging crash induction strategies. The approach integrates an improved QR-DQN framework to address uncertainties and adversarial tactics, incorporating advanced pilot experience into its reward functions. Extensive simulations reveal Alpha Crash's robust performance, achieving a 91.2% win rate across diverse scenarios by effectively guiding opponents into critical errors. Visualization and altitude analyses illustrate the agent's three-stage crash induction strategies that exploit adversaries' vulnerabilities. These findings underscore Alpha Crash's potential to enhance autonomous decision-making and strategic innovation in real-world air combat applications.
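For readers unfamiliar with QR-DQN, the sketch below shows the quantile Huber loss used by the standard algorithm. The abstract says only that an improved QR-DQN is used, so this is the textbook baseline rather than the paper's variant; the function names and the default `kappa` are illustrative choices.

```python
def huber(u, kappa=1.0):
    """Huber loss: quadratic near zero, linear beyond kappa."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)

def quantile_huber_loss(theta, targets, kappa=1.0):
    """Standard QR-DQN loss for one state-action pair.

    theta: current quantile estimates of the return (length N).
    targets: target return samples, e.g. r + gamma * next-state quantiles.
    Each estimate theta[i] is tied to the quantile midpoint
    tau_i = (2i + 1) / (2N), and the loss is summed over estimates
    and averaged over targets.
    """
    n = len(theta)
    loss = 0.0
    for i, th in enumerate(theta):
        tau = (2 * i + 1) / (2 * n)
        for t in targets:
            u = t - th
            indicator = 1.0 if u < 0 else 0.0
            # Asymmetric weight pushes theta[i] toward the tau-quantile.
            loss += abs(tau - indicator) * huber(u, kappa) / kappa
    return loss / len(targets)
```

Learning a set of return quantiles instead of a single expected value is what lets a distributional agent represent the outcome uncertainty that the abstract emphasises.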
Policy training against diverse opponents remains a challenge when using Multi-Agent Reinforcement Learning (MARL) in multiple Unmanned Combat Aerial Vehicle (UCAV) air combat scenarios. In view of this, this paper proposes a novel Dominant and Non-dominant strategy sample selection (DoNot) mechanism and a Local Observation Enhanced Multi-Agent Proximal Policy Optimization (LOE-MAPPO) algorithm to train the multi-UCAV air combat policy and improve its generalization. Specifically, the LOE-MAPPO algorithm adopts a mixed state that concatenates the global state and each individual agent's local observation to enable efficient value function learning in multi-UCAV air combat. The DoNot mechanism classifies opponents into dominant or non-dominant strategy opponents, and samples from easier to more challenging opponents to form an adaptive training curriculum. Empirical results demonstrate that the proposed LOE-MAPPO algorithm outperforms baseline MARL algorithms in multi-UCAV air combat scenarios, and the DoNot mechanism leads to stronger policy generalization when facing diverse opponents. The results pave the way for the fast generation of cooperative strategies for air combat agents with MARL algorithms.
The rapid development of military technology has prompted different types of equipment to break the limits of operational domains and, through complex interactions, to form a vast combat system of systems (CSoS), which can be abstracted as a heterogeneous combat network (HCN). It is of great military significance to study the disintegration strategy of combat networks to achieve the breakdown of the enemy's CSoS. To this end, this paper proposes an integrated framework called HCN disintegration based on double deep Q-learning (HCN-DDQL). Firstly, the enemy's CSoS is abstracted as an HCN, and an evaluation index based on the capability and attack costs of nodes is proposed. Meanwhile, a mathematical optimization model for HCN disintegration is established. Secondly, the learning environment and double deep Q-network model of HCN-DDQL are established to train the HCN's disintegration strategy. Then, based on the learned HCN-DDQL model, an algorithm for calculating the HCN's optimal disintegration strategy under different states is proposed. Finally, a case study is used to demonstrate the reliability and effectiveness of HCN-DDQL, and the results demonstrate that HCN-DDQL can disintegrate HCNs more effectively than baseline methods.
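The core of double deep Q-learning is its bootstrap target, sketched below in its generic form. How HCN-DDQL encodes network states and node-removal actions is not given in the abstract, so the list-of-Q-values interface here is an assumption made for illustration.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Generic double DQN bootstrap target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    The online network selects the action and the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

In a disintegration setting each action index would plausibly correspond to a node chosen for attack, but that mapping is an assumption rather than something stated in the abstract.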
The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots, making rapid, accurate decision-making challenging. While reinforcement learning (RL) has shown promise in this domain, existing methods often lack strategic depth and generalization in complex, high-dimensional environments. To address these limitations, this paper proposes an optimized self-play method enhanced by advancements in fighter modeling, neural network design, and algorithmic frameworks. Unlike traditional 3-DOF models, this study employs a six-degree-of-freedom (6-DOF) F-16 fighter model based on open-source aerodynamic data, featuring airborne equipment and a realistic visual simulation platform. To capture temporal dynamics, Long Short-Term Memory (LSTM) layers are integrated into the neural network, complemented by delayed input stacking. The RL environment incorporates expert strategies, curiosity-driven rewards, and curriculum learning to improve adaptability and strategic decision-making. Experimental results demonstrate that the proposed approach achieves a winning rate exceeding 90% against classical single-agent methods. Additionally, through enhanced 3D visual platforms, we conducted human-agent confrontation experiments, where the agent attained an average winning rate of over 75%. The agent's maneuver trajectories closely align with human pilot strategies, showcasing its potential in decision-making and pilot training applications. This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
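The abstract does not define "delayed input stacking". One common reading is that the network input concatenates the current observation with a fixed number of past observations; the sketch below assumes that reading, and the class name, zero padding, and flat-list layout are hypothetical choices.

```python
from collections import deque

class ObservationStacker:
    """Keeps the last `depth` observations and returns them concatenated."""

    def __init__(self, obs_dim, depth):
        self.depth = depth
        # Start with zero-filled frames so the output size is constant
        # from the first step of an episode.
        self.frames = deque([[0.0] * obs_dim for _ in range(depth)], maxlen=depth)

    def push(self, obs):
        """Add the newest observation and return the stacked input, oldest first."""
        self.frames.append(list(obs))
        return [x for frame in self.frames for x in frame]
```

Stacking gives a feed-forward layer direct access to short-term history, while the LSTM layers mentioned in the abstract carry longer-range temporal context.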
During its interaction with modern sports, traditional Wushu has faced increasing doubts about its combat effectiveness, raising concerns about its cultural identity. How traditional Wushu is understood as a combat art not only helps define its cultural essence but also carries important implications for its long-term development. It is an objective fact that combat represents the practical manifestation of traditional Wushu in history. Combat reflects similarities among traditional Wushu forms that emerged throughout history. Combat reflects the historical law governing the evolution of traditional Wushu and represents an abstraction of repetitive phenomena in traditional Wushu. A correct understanding of this objectivity, these similarities, and this repeatability is conducive to promoting and carrying forward traditional Wushu, thereby facilitating an objective analysis of differences among different traditional Wushu forms and the discovery of their evolution paradigm. In the contemporary context, it is essential for traditional Wushu to emphasize its distinctive cultural roots, thereby facilitating creative transformation and innovative development.
To extract and display the significant information of combat systems, this paper introduces the methodology of functional cartography into combat networks and proposes an integrated framework named "functional cartography of heterogeneous combat networks based on the operational chain" (FCBOC). In this framework, a functional module detection algorithm named operational chain-based label propagation algorithm (OCLPA), which considers the cooperation and interactions among combat entities and can thus naturally tackle network heterogeneity, is proposed to identify the functional modules of the network. Then, the nodes and their modules are classified into different roles according to their properties. A case study shows that FCBOC can provide a simplified description of disorderly information of combat networks and enable us to identify their functional and structural network characteristics. The results provide useful information to help commanders make precise and accurate decisions regarding the protection, disintegration, or optimization of combat networks. Three algorithms are also compared with OCLPA to show that FCBOC can most effectively find functional modules with practical meaning.
Beyond-visual-range (BVR) air combat threat assessment has attracted wide attention as a support for situation awareness and autonomous decision-making. However, the traditional threat assessment method is flawed in its failure to consider the intention and event of the target, resulting in inaccurate assessment results. In view of this, an integrated threat assessment method is proposed to address the existing problems, such as overly subjective determination of index weights and imbalance of situation. The process and characteristics of BVR air combat are analyzed to establish a threat assessment model in terms of target intention, event, situation, and capability. On this basis, a distributed weight-solving algorithm is proposed to determine the index and attribute weights, respectively. Then, variable weight and game theory are introduced to effectively deal with the situation imbalance and to combine subjective and objective weighting. The performance of the model and algorithm is evaluated through multiple simulation experiments. The assessment results demonstrate the accuracy of the proposed method in BVR air combat, indicating its potential practical significance in real air combat scenarios.
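With the rapid development of artificial intelligence, large language models (LLMs) have shown strong capabilities in cognition, reasoning, and decision-making, offering new opportunities for the intelligent upgrading of wargaming systems. To address the shortcomings of single intelligent decision-making methods in situation understanding, intelligent decision-making, and adversarial capability, this paper proposes an LLM-based intelligent decision-making framework for maritime combat wargaming. The framework integrates key technologies including perception and understanding, knowledge engineering, transparent reasoning, and hierarchical memory to build a multi-level, end-to-end "perception-understanding-decision-execution" architecture. A task-oriented tactical knowledge base, a chain-of-thought reasoning mechanism, and a memory bank are designed to support intelligent assisted mission planning in complex battlefield environments. The framework is tested on the Command Modern Operations (CMO) simulation platform, with adversarial wargames conducted for typical maritime combat missions. The results show that the proposed framework significantly outperforms traditional methods in task completion, resource utilization efficiency, decision rationality, and interpretability, and has good practical application value.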
Funding (ACMD curriculum self-play SAC study): National Key Research and Development Plan (No. 2021YFB3302501); National Science Foundation of China (No. 12161076); Fundamental Research Funds for the Central Universities (No. DUT25GF207).
Funding (MAV-UAV force formation study): Natural Science Foundation of Shaanxi Province (2023-JC-QN-0728); China Postdoctoral Science Foundation (2021M693942).
Funding (QCNet-3D study): National Natural Science Foundation (NSF) of China (No. 61976014); Aeronautical Science Foundation of China (2022Z071051001).
Funding (Alpha Crash study): National Key R&D Program of China (No. 2021YFB3300602).
Funding (HCN-DDQL study): National Natural Science Foundation of China (72001209, 72231011, 72071206); Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province (2020RC4046); Science Foundation for Outstanding Youth Scholars of Hunan Province (2022JJ20047).
Funding (6-DOF F-16 self-play study): co-supported by the National Natural Science Foundation of China (No. 91852115).
Funding (BVR threat assessment study): National Natural Science Foundation of China (62006193, 62103338); Aeronautical Science Foundation of China (2022Z023053001); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-115); Fundamental Research Funds for the Central Universities (D5000230150).