Abstract: Multi-UAV cooperative target search has been widely used in various environments, such as emergency rescue and traffic monitoring. However, the uncertain communication network among UAVs exhibits unstable links and rapid topological fluctuations due to mission complexity and unpredictable environmental states. This limitation hinders timely information sharing and well-informed path decisions for UAVs, resulting in inefficient or even failed collaborative search. To address this issue, this paper proposes a multi-UAV cooperative search strategy that develops a real-time trajectory decision incorporating autonomous connectivity to reinforce multi-UAV collaboration and accelerate search in uncertain search environments. Specifically, an autonomous connectivity strategy based on node cognitive information and network states is introduced to enable effective message transmission and adapt to the dynamic network environment. Based on the fused information, we formalize trajectory planning as a multi-objective optimization problem by jointly considering search performance and UAV energy harnessing. A multi-agent deep reinforcement learning based algorithm is proposed to solve it, where the reward-guided real-time path is determined to achieve an energy-efficient search. Finally, extensive experimental results show that the proposed algorithm outperforms existing works in terms of average search rate and coverage rate with reduced energy consumption under uncertain search environments.
Abstract: The American critic argues that the 2025 U.S. National Security Strategy, with its isolationist and confrontational approach towards allies and China, is a desperate fiction that undermines genuine American prosperity and security.
Abstract: Cooperative multi-UAV search requires jointly optimizing wide-area coverage, rapid target discovery, and endurance under sensing and motion constraints. Resolving this coupling enables scalable coordination with high data efficiency and mission reliability. We formulate this problem as a discounted Markov decision process on an occupancy grid with a cellwise Bayesian belief update, yielding a Markov state that couples agent poses with a probabilistic target field. On this belief–MDP we introduce a segment-conditioned latent-intent framework, in which a discrete intent head selects a latent skill every K steps and an intra-segment GRU policy generates per-step control conditioned on the fixed intent; both components are trained end-to-end with proximal updates under a centralized critic. On the 50×50 grid, coverage and discovery convergence times are reduced by up to 48% and 40% relative to a flat actor-critic benchmark, and the aggregated convergence metric improves by about 12% compared with a state-of-the-art hierarchical method. Qualitative analyses further reveal stable spatial sectorization, low path overlap, and fuel-aware patrolling, indicating that segment-conditioned latent intents provide an effective and scalable mechanism for coordinated multi-UAV search.
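The cellwise Bayesian belief update mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the sensor model, so the binary detection model and the parameters `p_detect` and `p_false`, as well as the function names `update_cell` and `update_grid`, are assumptions made for illustration and are not taken from the paper.

```python
def update_cell(belief, detected, p_detect=0.9, p_false=0.1):
    """Bayes update of one cell's target probability.

    belief   : prior probability that the cell holds a target
    detected : True if the (assumed) binary sensor reported a detection
    p_detect : assumed P(detection | target present)
    p_false  : assumed P(detection | no target)
    """
    like_target = p_detect if detected else 1.0 - p_detect
    like_empty = p_false if detected else 1.0 - p_false
    numerator = like_target * belief
    denominator = numerator + like_empty * (1.0 - belief)
    # Keep the prior if the observation has zero probability under the model.
    return numerator / denominator if denominator > 0 else belief


def update_grid(grid, observations, p_detect=0.9, p_false=0.1):
    """Return a new belief grid; only observed cells are updated.

    grid         : list of rows of prior probabilities
    observations : dict mapping (row, col) -> bool sensor reading
    """
    new_grid = [row[:] for row in grid]
    for (r, c), detected in observations.items():
        new_grid[r][c] = update_cell(grid[r][c], detected, p_detect, p_false)
    return new_grid


if __name__ == "__main__":
    prior = [[0.5, 0.5], [0.5, 0.5]]
    posterior = update_grid(prior, {(0, 1): True, (1, 0): False})
    for row in posterior:
        print(row)
```

Each cell is updated independently of its neighbours, which is what makes the update cellwise; unobserved cells keep their prior.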
Funding: the Project of National Natural Science Foundation of China (Grant No. 62106283); the Project of National Natural Science Foundation of China (Grant No. 72001214), to provide funds for conducting experiments; the Project of Natural Science Foundation of Shaanxi Province (Grant No. 2020JQ-484).
Abstract: Ground-to-air confrontation involves large-scale task assignment that must handle many concurrent assignments and random events. Existing task assignment methods, when applied to ground-to-air confrontation, handle complex tasks inefficiently and suffer from interaction conflicts in multiagent systems. This study proposes a multiagent architecture based on one general agent with multiple narrow agents (OGMN) to reduce task assignment conflicts. Considering the slow speed of traditional dynamic task assignment algorithms, this paper proposes the proximal policy optimization for task assignment of general and narrow agents (PPO-TAGNA) algorithm. Based on the idea of the optimal assignment strategy algorithm and combined with the training framework of deep reinforcement learning (DRL), the algorithm adds a multihead attention mechanism and a stage reward mechanism to the bilateral band clipping PPO algorithm to address low training efficiency. Finally, simulation experiments are carried out in the digital battlefield. The multiagent architecture based on OGMN combined with the PPO-TAGNA algorithm obtains higher rewards faster and has a higher win ratio. By analyzing agent behavior, the efficiency, superiority, and rationality of resource utilization of this method are verified.
Funding: supported by the Military Scientific Research Project (41405030302, 41401020301).
Abstract: To meet the requirements that live-virtual-constructive (LVC) tactical confrontation (TC) places on the virtual entity (VE) decision model, namely graded combat capability, diversified actions, real-time decision-making, and generalization to the enemy, the confrontation process is modeled as a zero-sum stochastic game (ZSG). The problem of reward sparsity in the model is addressed by introducing the theory of dynamic relative power potential field, and the problem of credit assignment between agents is addressed by reward shaping. Based on the idea of meta-learning, an extensible multi-agent deep reinforcement learning (EMADRL) framework and solving method are proposed to improve the effectiveness and efficiency of model solving. Experiments show that the model meets the requirements well and that the algorithm learns efficiently.
Abstract: In this paper, as a new contribution to the tensor-centric warfare (TCW) series [1] [2] [3] [4], we extend the kinetic TCW-framework to include non-kinetic effects, by addressing a general systems confrontation [5], which is waged not only in the traditional physical Air-Land-Sea domains, but also simultaneously across multiple non-physical domains, including cyberspace and social networks. Upon this basis, this paper attempts to address a more general analytical scenario using rigorous topological methods to introduce a two-level topological representation of modern armed conflict; in doing so, it extends from the traditional red-blue model of conflict to a red-blue-green model, where green represents various neutral elements as active factions; indeed, green can effectively decide the outcomes from red-blue conflict. System confrontations at various stages of the scenario will be defined by the non-equilibrium phase transitions which are superficially characterized by sudden entropy growth. These will be shown to have underlying topology changes of the systems-battlespace. The two-level topological analysis of the systems-battlespace is utilized to address the question of topology changes in the combined battlespace. Once an intuitive analysis of the combined battlespace topology is performed, a rigorous topological analysis follows using (co)homological invariants of the combined systems-battlespace manifold.
Abstract: The Short Happy Life of Francis Macomber is a quintessential Hemingway tale of one man's attempt to overcome an internal struggle by mastering the external world. Francis Macomber discovers his own bravery and strength when he ignores his self-consciousness and relies on instinct. This essay will examine Hemingway's code and how it confronts nada, and thus analyze Macomber's change from innocent to suffering to aware.
Abstract: In The Great Gatsby, Fitzgerald depicts the conflicts and contradictions between men and women about society, family, love, and money, mirroring the patriarchal society constantly challenged by feminism in 1920s America. This paper intends to compare the features of masculinism and feminism in three aspects: gender, society, and morality. Different identifications of gender roles between men and women lead to female protests against male superiority and to the pursuit of individual liberation. Meanwhile, men's unshaken egotism and women's gradually expanding individualism leave both without sound moral standards. Compared with women, however, men's moral pride gives them more proper moral judgment, which reflects Fitzgerald's support of the masculine society. Probing into the confrontation between masculinism and feminism benefits further study of how to achieve equal coexistence and harmony between men and women.
Abstract: This paper analyzes the characteristics of Internet space and confrontation and discusses the main technologies of network-space attack and defense confrontation. It presents an implementation scheme for a network-space attack and defense confrontation system and analyzes its feasibility. The technology and the system can provide technical support for China's development in network space, safeguard the security of China's network space, and promote the development of China's network-space security industry; they play an important role in speeding up the development of China's independent and controllable security products.
Funding: co-supported by the National Natural Science Foundation of China (Nos. 92371201 and 52192633), the Natural Science Foundation of Shaanxi Province of China (No. 2022JC-03), and the Aeronautical Science Foundation of China (No. ASFC-20220019070002).
Abstract: In multiple Unmanned Aerial Vehicle (UAV) systems, achieving efficient navigation is essential for executing complex tasks and enhancing autonomy. Traditional navigation methods depend on predefined control strategies and trajectory planning and often perform poorly in complex environments. To improve the UAV-environment interaction efficiency, this study proposes a multi-UAV integrated navigation algorithm based on Deep Reinforcement Learning (DRL). This algorithm integrates the Inertial Navigation System (INS), Global Navigation Satellite System (GNSS), and Visual Navigation System (VNS) for comprehensive information fusion. Specifically, an improved multi-UAV integrated navigation algorithm called Information Fusion with Multi-Agent Deep Deterministic Policy Gradient (IF-MADDPG) was developed. This algorithm enables UAVs to learn collaboratively and optimize their flight trajectories in real time. Through simulations and experiments, test scenarios in GNSS-denied environments were constructed to evaluate the effectiveness of the algorithm. The experimental results demonstrate that the IF-MADDPG algorithm significantly enhances the collaborative navigation capabilities of multiple UAVs in formation maintenance and GNSS-denied environments. Additionally, it has advantages in terms of mission completion time. This study provides a novel approach for efficient collaboration in multi-UAV systems, which significantly improves the robustness and adaptability of navigation systems.
Funding: supported by the Committee of Science of the Ministry of Education and Science of the Republic of Kazakhstan under Grant No. 249015/0224.
Abstract: Unmanned aerial vehicles (UAVs) are widely used in situations with uncertain and risky areas lacking network coverage. In natural disasters, timely delivery of first aid supplies is crucial. Current UAVs face risks such as crashing into birds or unexpected structures. Airdrop systems with parachutes risk dispersing payloads away from target locations. The objective here is to use multiple UAVs to distribute payloads cooperatively to assigned locations. The civil defense department must balance coverage, accurate landing, and flight safety while considering battery power and capability. Deep Q-network (DQN) models are commonly used in multi-UAV path planning to effectively represent the surroundings and action spaces. Earlier strategies focused on advanced DQNs for UAV path planning in different configurations, but rarely addressed non-cooperative scenarios and disaster environments. This paper introduces a new DQN framework to tackle challenges in disaster environments. It considers unforeseen structures and birds that could cause UAV crashes and assumes urgent landing zones and winch-based airdrop systems for precise delivery and return. A new DQN model is developed, which incorporates the battery life, safe flying distance between UAVs, and remaining delivery points to encode surrounding hazards into the state space and Q-networks. Additionally, a unique reward system is created to improve UAV action sequences for better delivery coverage and safe landings. The experimental results demonstrate that multi-UAV first aid delivery in disaster environments can achieve advanced performance.
Abstract: To address the low convergence efficiency of traditional multi-UAV path planning algorithms in unknown complex environments, this paper proposes a deep reinforcement learning algorithm incorporating the attention mechanism. The method is based on the Soft Actor-Critic (SAC) framework and introduces a multi-attention mechanism in the Critic network that dynamically learns the dependency relationships between agents and enables key information screening and conflict avoidance. An environment with multiple random obstacles is designed to simulate complex emergent situations. The results show that, compared with the baseline method that does not use attention, the proposed algorithm significantly improves the mission success rate and average reward and significantly extends the survival time and exploration range of the UAVs, which verifies the effectiveness of the attention mechanism in enhancing the efficiency, robustness, and long-term planning capability of multi-UAV collaboration.
Funding: supported by the National Research and Development Program of China under Grant JCKY2018607C019, and in part by the Key Laboratory Fund of UAV of Northwestern Polytechnical University under Grant 2021JCJQLB0710L.
Abstract: This paper proposes a Multi-Agent Attention Proximal Policy Optimization (MA2PPO) algorithm to address problems such as credit assignment, low collaboration efficiency, and weak strategy generalization in the cooperative pursuit tasks of multiple unmanned aerial vehicles (UAVs). Traditional algorithms often fail to effectively identify critical cooperative relationships in such tasks, leading to low capture efficiency and a significant decline in performance when the scale expands. To tackle these issues, based on the proximal policy optimization (PPO) algorithm, MA2PPO adopts the centralized training with decentralized execution (CTDE) framework and introduces a dynamic decoupling mechanism, that is, sharing the multi-head attention (MHA) mechanism among critics during centralized training to solve the credit assignment problem. This method enables the pursuers to identify highly correlated interactions with their teammates, effectively eliminate irrelevant and weakly relevant interactions, and decompose large-scale cooperation problems into decoupled sub-problems, thereby enhancing the collaborative efficiency and policy stability among multiple agents. Furthermore, a reward function has been devised to help the pursuers encircle the escapee by combining a formation reward with a distance reward, which incentivizes UAVs to develop sophisticated cooperative pursuit strategies. Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs. Moreover, experiments on scalability have demonstrated that the algorithm is suitable for large-scale multi-UAV systems.
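The idea of combining a formation reward with a distance reward to encourage encirclement can be sketched as follows. The abstract does not specify either term, so everything here is a hypothetical illustration: the formation term penalizes uneven angular spacing of pursuers around the escapee, the distance term penalizes mean pursuer-escapee distance, and the weights `w_formation` and `w_distance` are arbitrary.

```python
import math


def distance_reward(pursuers, escapee):
    """Negative mean Euclidean distance from the pursuers to the escapee."""
    return -sum(math.dist(p, escapee) for p in pursuers) / len(pursuers)


def formation_reward(pursuers, escapee):
    """Negative worst deviation from evenly spaced bearings around the escapee.

    Returns 0 when the pursuers are spread at equal angles (a full encirclement)
    and becomes more negative as they bunch together on one side.
    """
    angles = sorted(
        math.atan2(p[1] - escapee[1], p[0] - escapee[0]) for p in pursuers
    )
    n = len(angles)
    # Angular gap between each pursuer and the next one, wrapping around.
    gaps = [(angles[(i + 1) % n] - angles[i]) % (2 * math.pi) for i in range(n)]
    ideal_gap = 2 * math.pi / n
    return -max(abs(gap - ideal_gap) for gap in gaps)


def pursuit_reward(pursuers, escapee, w_formation=1.0, w_distance=0.5):
    """Weighted sum of the two assumed terms (weights are illustrative)."""
    return (w_formation * formation_reward(pursuers, escapee)
            + w_distance * distance_reward(pursuers, escapee))


if __name__ == "__main__":
    ring = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    clustered = [(1.0, 0.0), (1.0, 0.1), (1.0, -0.1), (1.0, 0.2)]
    print(pursuit_reward(ring, (0.0, 0.0)))
    print(pursuit_reward(clustered, (0.0, 0.0)))
```

Under these assumptions, an evenly spaced ring around the escapee scores higher than a cluster at a similar distance, which is the behavior a formation term is meant to reward.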
Funding: Funded by the National Defense Science and Technology Innovation project, grant number ZZKY20223103; the Basic Frontier Innovation Project at the Engineering University of PAP, grant number WJY202429; the Basic Frontier Innovation Project at the Engineering University of PAP, grant number WJY202408; the Graduate Student Funding Priority Project, grant number JYWJ2024B006; and the Key Project of the National Social Science Foundation, grant number 2023-SKJJ-A-116.
Abstract: This study introduces the dung beetle optimization algorithm based on bounded reflection optimization and multi-strategy fusion (BFDBO), a novel algorithm designed for multi-UAV collaborative trajectory planning in complex battlefield environments. First, a collaborative planning cost function for the multi-UAV system is formulated, which converts the trajectory planning task into an optimization problem. Building on the original dung beetle optimization (DBO) algorithm, BFDBO incorporates three innovations: a boundary reflection mechanism, an adaptive mixed exploration strategy, and a dynamic multi-scale mutation strategy. These enhancements are intended to improve the balance between local exploration and global exploitation, facilitating the discovery of globally optimal trajectories that minimize the cost function. Numerical simulations on the CEC2022 benchmark functions indicate that all three enhancements improve the performance of BFDBO, yielding faster convergence and higher optimization accuracy than leading optimization algorithms. In two battlefield scenarios of differing complexity, BFDBO reduced total trajectory planning cost by at least 39% compared with DBO and three other high-performance variants, while also achieving a better average runtime. These results support the effectiveness and applicability of BFDBO in practical contexts.
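The abstract above names a boundary reflection mechanism but does not state its rule. A common generic form, shown here purely as an assumption, mirrors any out-of-range coordinate back into the search box instead of clipping it to the edge, so candidates do not pile up on the boundary. The helper name `reflect_into_bounds` is hypothetical, and this is not claimed to be BFDBO's exact update.

```python
import numpy as np

def reflect_into_bounds(x, lower, upper):
    """Mirror out-of-range coordinates back into [lower, upper].

    Generic mirror reflection (an assumed form, not necessarily BFDBO's rule).
    Requires upper > lower in every dimension.
    """
    x = np.asarray(x, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    span = upper - lower

    # Fold the offset into one period of length 2 * span ...
    y = np.mod(x - lower, 2.0 * span)
    # ... then mirror the second half of the period back onto the first.
    y = np.where(y > span, 2.0 * span - y, y)
    return lower + y


if __name__ == "__main__":
    # A candidate that overshoots the unit box in two coordinates.
    print(reflect_into_bounds([1.2, -0.3, 0.5], 0.0, 1.0))
```

Because the fold is periodic, the same helper also handles large overshoots that would bounce off the walls more than once.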
Funding: Supported by the National Natural Science Foundation of China (No. 62202449 and No. 62472410) and the National Key Research and Development Program of China (2021YFB2900102).
Abstract: Cooperative target search with multiple UAVs has been widely used in various settings, such as emergency rescue and traffic monitoring. However, the uncertain communication network among UAVs exhibits unstable links and rapid topological fluctuations due to mission complexity and unpredictable environmental states. This limitation hinders timely information sharing and well-informed path decisions, resulting in inefficient or even failed collaborative search. To address this issue, this paper proposes a multi-UAV cooperative search strategy built on real-time trajectory decisions that incorporate autonomous connectivity, which strengthens multi-UAV collaboration and accelerates search in uncertain environments. Specifically, an autonomous connectivity strategy based on node cognitive information and network states is introduced to enable effective message transmission and adapt to the dynamic network environment. Based on the fused information, we formalize trajectory planning as a multi-objective optimization problem that jointly considers search performance and UAV energy harnessing. A multi-agent deep reinforcement learning algorithm is proposed to solve it, in which a reward-guided real-time path is determined to achieve an energy-efficient search. Finally, extensive experimental results show that the proposed algorithm outperforms existing works in average search rate and coverage rate, with reduced energy consumption, under uncertain search environments.