Multi-agent theory is introduced and used to determine the emergency control amount according to the stability margin, and on this basis an emergency control strategy for the power system is presented. The proposed multi-agent control structure has three layers: a system agent, areal agents, and local agents. The system agent sends control execution signals to the local load agents according to the load-shedding location and amount uploaded by the areal agent. The areal agent judges whether the power system is stable by monitoring and analyzing the maximum relative power angle; if the system is unstable, it determines the load-shedding location and the optimal load-shedding amount from the stability margin based on the corrected transient energy function, and uploads the control amount to the system agent. The local generator agent monitors the power angles of the generator sets in real time and uploads them to the areal agent, while the local load agent controls load upon receiving the control signal from the system agent. Simulations on the IEEE 39-bus system show that the proposed control strategy improves system stability.
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. We conduct ablation and baseline comparisons on a multi-user, dual-link simulation platform. The results show that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
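The "twin Q-value estimates" above refer to the two critics that TD3-style algorithms maintain. As a minimal sketch, assuming scalar critic outputs, the standard clipped double-Q target can be written as follows. This illustrates only the baseline TD3 target; it is not the ALN, RO, or dynamic-temperature improvements described in the abstract, and the function name and default discount factor are illustrative assumptions.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Standard TD3 bootstrap target.

    Uses the smaller of the two next-state critic estimates to limit
    overestimation. All names and the default gamma are illustrative.
    """
    min_q = min(q1_next, q2_next)
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * min_q


if __name__ == "__main__":
    # The pessimistic critic (2.0) is used for bootstrapping.
    print(clipped_double_q_target(1.0, False, 2.0, 3.0, gamma=0.5))
```

When the two critics are strongly correlated, taking their minimum adds little beyond a single critic, which is presumably why the abstract targets the correlation between them.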
To address the problem of cooperative air defense for surface warship formations, this paper maps the cooperative air-defense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS), based on the similarity of their defense mechanisms and characteristics. It then designs the component models and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent, and a communicating agent, drawing on the theories and methods of the artificial immune system, the multi-agent system (MAS), vaccines, and the danger theory (DT). A new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is then proposed, and the immune response and immune mechanism of the CASoSSWF are analyzed. The model has memory, evolution, good adaptability to dynamic environments, and self-learning capabilities, and it adequately embodies the cooperative air-defense mechanism of the CASoSSWF. It therefore offers a novel approach for the CASoSSWF that can provide conceptual models for a surface warship formation operation simulation system.
The multi-agent system is the optimal solution to complex intelligent problems. Drawing on game theory, the concept of loyalty is introduced to analyze the relationship between agents' individual income and global benefits and to build the logical architecture of the multi-agent system. To verify the feasibility of the method, the cyclic neural network is optimized, the bi-directional coordination network is built as the training network for deep learning, and specific training scenes are simulated as the training background. After a certain number of training iterations, the model can learn simple strategies autonomously, and as the training time increases, the complexity of the learned strategies rises gradually. The achievability of the model is demonstrated through examples of obstacle avoidance, firepower distribution, and cooperative cover. With the same resources, the model exhibits better convergence than other deep learning training networks and does not easily fall into a local endless loop. Furthermore, the learned strategies are more capable than those of a rule-based training model, which is of great practical value.
This paper proposes a control strategy called enclosing control. The strategy can be described as follows: the followers design their control inputs based on the state information of neighboring agents and move to specified positions, such that the convex hull formed by the followers contains the leaders. We use the single-integrator model to describe the dynamics of the agents and propose a continuous-time control protocol and a sampled-data-based protocol for multi-agent systems with stationary leaders and a fixed network topology. The state differential equations are then analyzed to obtain the parameter requirements for the system to converge, and the conditions for achieving enclosing control are established for both protocols. A special case of enclosing control in which no leader is located on the convex hull boundary is also studied; it can effectively prevent enclosing control failures caused by errors in the system. Several simulations are presented to validate the theoretical results and compare the differences between the three control protocols. Finally, experimental results on a multi-robot platform are provided to verify the feasibility of the protocol in a physical system.
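The enclosing condition above (the followers' convex hull contains the leaders) can be checked numerically. The following is a minimal sketch for planar positions, assuming 2D tuples and using a standard monotone-chain hull; it only verifies the geometric condition and does not implement the paper's control protocols. The function names and the `strict` flag (which models the special case where no leader may lie on the hull boundary) are illustrative assumptions.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull, returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def encloses(followers, leader, strict=False):
    """True if the leader lies in the followers' convex hull.

    With strict=True, a leader on the hull boundary does not count.
    """
    hull = convex_hull(followers)
    if len(hull) < 3:
        return False  # degenerate hull: treated as not enclosing in this sketch
    for i in range(len(hull)):
        c = cross(hull[i], hull[(i + 1) % len(hull)], leader)
        if c < 0 or (strict and c == 0):
            return False
    return True


if __name__ == "__main__":
    followers = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(encloses(followers, (2, 2)))               # interior leader
    print(encloses(followers, (4, 2), strict=True))  # boundary leader
```

A check like this could be run on the final follower positions of a simulation to confirm that enclosing was achieved.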
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future research directions in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. First, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Second, a memory-based adaptive event-triggered mechanism is combined with neural networks to approximate the nonlinear terms in the networked system and conserve system resources efficiently. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, a multi-unmanned ground vehicle (Multi-UGV) system is constructed to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Inspired by immune theory and multi-agent systems, an immune multi-agent active defense model for network intrusion is established. The concept of an immune agent is introduced, and its running mechanism is established. A method that uses antibody concentration to quantitatively describe the degree of intrusion danger is presented. The model implements a multi-layer, distributed active defense mechanism for network intrusion. Experimental results show that the model is a good solution for network security defense.
This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by a limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on a Graph Neural Network (GNN). First, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs that establish the essential relationships among agents, obstacles, and targets. Second, a Graph Attention Network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. The models are trained within a framework that integrates centralized training and distributed execution. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance, with fewer collisions. The experiments also show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
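The idea of attention that "focuses exclusively on adjacent nodes" can be illustrated with a masked softmax. The sketch below assumes that raw pairwise attention scores have already been computed (in a real GAT they come from a learned scoring function, which is omitted here) and normalizes them over each node's neighbors only. The function name and data layout are illustrative assumptions, not the paper's model.

```python
import math


def neighbor_attention(scores, adjacency, i):
    """Attention weights of node i over its adjacent nodes only.

    scores[i][j]    : raw attention score from node i to node j (assumed given)
    adjacency[i][j] : truthy if j is adjacent to i
    Non-adjacent nodes receive weight 0; neighbor weights sum to 1.
    """
    n = len(adjacency[i])
    neighbors = [j for j in range(n) if adjacency[i][j]]
    if not neighbors:
        return [0.0] * n
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores[i][j] for j in neighbors)
    exps = {j: math.exp(scores[i][j] - m) for j in neighbors}
    total = sum(exps.values())
    return [exps[j] / total if j in exps else 0.0 for j in range(n)]


if __name__ == "__main__":
    scores = [[0.0, 0.0, 5.0]]
    adjacency = [[1, 1, 0]]  # node 2 is not adjacent, so its high score is ignored
    print(neighbor_attention(scores, adjacency, 0))
```

Masking before the softmax means that a distant node cannot draw attention regardless of its raw score, which matches the limited field of view described in the abstract.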
Consensus theory and noncooperative game theory deal with cooperative and noncooperative interactions, respectively, among multiple players/agents. Together they provide a natural framework for road pricing design, since each motorist may myopically optimize his or her own utility as a function of road price while also communicating with friends and neighbors about the traffic situation. This paper considers road pricing design using game theory and consensus theory. For the case where a system supervisor broadcasts information on the overall system to each agent, we present a variant of standard fictitious play called average strategy fictitious play (ASFP) for large-scale repeated congestion games. Only a weighted running average of all other players' actions is assumed to be available to each player. The ASFP reduces the burden of both information gathering and information processing for each player. Compared with the joint strategy fictitious play (JSFP) studied in the literature, it avoids the process of updating utility functions for each player. We prove that there exists at least one pure-strategy Nash equilibrium for the congestion game under investigation, and that the players' actions generated by the ASFP with inertia (players' reluctance to change their previous actions) converge to a Nash equilibrium almost surely. For the case without broadcasting, a consensus protocol is introduced for individual agents to estimate the percentage of players choosing each resource, and the convergence of the players' action profile is still ensured. The results are applied to road pricing design to achieve socially locally optimal trip timing. Simulation results based on real traffic data are provided for the Singapore case study.
Due to the characteristics of line-of-sight (LoS) communication in unmanned aerial vehicle (UAV) networks, these systems are highly susceptible to eavesdropping and surveillance. To address the security concerns in UAV communication, covert communication methods have been adopted. This paper explores the joint optimization of trajectory and transmission power in a multi-hop UAV relay covert communication system. Considering communication covertness, power constraints, and trajectory limitations, an algorithm based on multi-agent proximal policy optimization (MAPPO), named covert-MAPPO (C-MAPPO), is proposed. The proposed method leverages the strengths of both optimization algorithms and reinforcement learning to analyze and make joint decisions on the transmission power and flight trajectory strategies of the UAVs so that they cooperate. Simulation results demonstrate that the proposed method can maximize the system throughput while satisfying covertness constraints, and that it outperforms benchmark algorithms in terms of system throughput and reward convergence speed.
Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches (periodic or overly frequent MTD triggers) lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Multi-agent systems (MASs) have demonstrated significant achievements in a wide range of tasks, leveraging their capacity for coordination and adaptation within complex environments. Moreover, the enhancement of their intelligent functionalities is crucial for tackling increasingly challenging tasks. This goal resonates with a paradigm shift within the artificial intelligence (AI) community, from “internet AI” to “embodied AI”, and MASs with embodied AI are referred to as embodied multi-agent systems (EMASs). An EMAS has the potential to acquire generalized competencies through interactions with environments, enabling it to effectively address a variety of tasks and thereby make a substantial contribution to the quest for artificial general intelligence. Despite the burgeoning interest in this domain, a comprehensive review of EMASs has been lacking. This paper offers an analysis and synthesis of EMASs from a control perspective, conceptualizing each embodied agent as an entity equipped with a “brain” for decision and a “body” for environmental interaction. System designs are classified into open-loop, closed-loop, and double-loop categories, and EMAS implementations are discussed. Additionally, the current applications and challenges faced by EMASs are summarized, and potential avenues for future research in this field are provided.
This paper focuses on the velocity-constrained consensus problem of discrete-time heterogeneous multi-agent systems with nonconvex constraints and arbitrarily switching topologies, where each agent has first-order or second-order dynamics. To solve this problem, a distributed algorithm is proposed based on a contraction operator. By employing the properties of stochastic matrices, it is shown that all agents' position states converge to a common point, and that the second-order agents' velocity states remain in their corresponding nonconvex constraint sets and converge to zero, as long as the joint communication topology has a directed spanning tree. Finally, numerical simulation results are provided to verify the effectiveness of the proposed algorithm.
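The role of the stochastic matrix and the spanning-tree condition can be seen in the simplest setting. The sketch below assumes first-order agents, a fixed topology, and no velocity constraints, and iterates x(k+1) = W x(k) with a row-stochastic W. It deliberately omits the contraction operator, the nonconvex constraints, and the switching topologies treated in the paper; the weight matrix and function names are illustrative assumptions.

```python
def consensus_step(weights, states):
    """One update x(k+1) = W x(k) for a row-stochastic weight matrix W."""
    return [sum(w * x for w, x in zip(row, states)) for row in weights]


def run_consensus(weights, states, steps):
    """Iterate the averaging update for a fixed number of steps."""
    for _ in range(steps):
        states = consensus_step(weights, states)
    return states


if __name__ == "__main__":
    # Directed chain 0 -> 1 -> 2: agent 0 is the root of a spanning tree.
    W = [
        [1.0, 0.0, 0.0],  # root keeps its own state
        [0.5, 0.5, 0.0],  # agent 1 averages with agent 0
        [0.0, 0.5, 0.5],  # agent 2 averages with agent 1
    ]
    print(run_consensus(W, [1.0, 5.0, -3.0], steps=200))
```

Because the graph has a directed spanning tree rooted at agent 0, every state is drawn to the root's value. If the link from agent 0 to agent 1 were removed, agent 0 would be cut off from the others and consensus would generally fail.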
The development of chassis active safety control technology has improved vehicle stability under extreme conditions. However, its cross-system and multi-functional characteristics make it difficult for the controller to achieve cooperative goals. In addition, the chassis system, with its high complexity, numerous subsystems, and strong coupling, leads to low computational efficiency and poor control performance. This paper therefore proposes a scenario-driven hybrid distributed model predictive control algorithm with variable control topology. The algorithm divides the vehicle's β−γ phase plane into multiple stability regions, forming a mapping between the control structure and the vehicle's state. A control input fusion mechanism within the transition domain is designed to mitigate the system state oscillation and control input jitter caused by switching control structures. Then, a distributed state-space equation with state coupling and input coupling is constructed, and a weighted local agent cost function in quadratic programming form is derived. Through cost coupling, local agents can coordinate toward global performance goals. Finally, Simulink/CarSim joint simulation and hardware-in-the-loop (HIL) tests validate that the proposed algorithm improves vehicle stability while ensuring trajectory tracking accuracy, and that it is well suited to multi-objective coordinated control. The approach combines the advantages of distributed MPC and decentralized MPC, balancing closeness to the globally optimal result against solution efficiency.
This paper focuses on the problem of leader-following consensus for nonlinear cascaded multi-agent systems. The control problem for these systems is transformed into a sequence of control problems for lower-order error subsystems. A distributed consensus analysis of the corresponding error systems is conducted using recursive methods and virtual controllers, with a series of Lyapunov functions constructed throughout the iterative process; this solves the leader-following consensus problem for a class of nonlinear cascaded multi-agent systems. Simulation examples illustrate the effectiveness of the proposed control algorithm.
In this paper, the distributed optimal formation control problem of heterogeneous Euler–Lagrange multi-agent systems with generic formation constraints and inequality constraints is investigated. Based on primal–dual dynamics and the adaptive control technique, a distributed optimal formation controller consisting of a velocity reference signal generator and a velocity tracking controller is proposed. By using the optimality condition, the relationship between the equilibrium point of the closed-loop system and the optimal solution of the optimization problem is established. Then, by Lyapunov stability analysis, it is rigorously proved that the optimal formation is reached with the proposed controller. Lastly, simulation examples are provided to substantiate the theoretical results.
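To make the link between primal–dual dynamics and the optimality condition concrete, the sketch below applies a projected primal–dual gradient iteration to a toy scalar problem: minimize (x − 3)² subject to x ≤ 1. The problem, step size, and function name are illustrative assumptions unrelated to the Euler–Lagrange agents or the formation constraints of the paper. The equilibrium of the iteration satisfies the KKT conditions, giving x = 1 with multiplier λ = 4.

```python
def primal_dual_solve(step=0.01, iterations=5000):
    """Projected primal-dual gradient iteration for
        minimize (x - 3)^2  subject to  x - 1 <= 0.

    Lagrangian: L(x, lam) = (x - 3)^2 + lam * (x - 1), with lam >= 0.
    The primal variable descends on L; the dual variable ascends on L
    and is projected back onto lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(iterations):
        grad_x = 2.0 * (x - 3.0) + lam        # dL/dx
        grad_lam = x - 1.0                    # dL/dlam (constraint value)
        x_next = x - step * grad_x            # primal descent
        lam = max(0.0, lam + step * grad_lam) # projected dual ascent
        x = x_next
    return x, lam


if __name__ == "__main__":
    x_opt, lam_opt = primal_dual_solve()
    print(x_opt, lam_opt)
```

At the equilibrium both updates vanish: 2(x − 3) + λ = 0 and x − 1 = 0. These are exactly the stationarity and active-constraint conditions of the optimization problem, which is the kind of relationship between equilibrium and optimum that the abstract refers to.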
This article investigates the time-varying output group formation tracking control (GFTC) problem for heterogeneous multi-agent systems (HMASs) under switching topologies. The objective is to design a distributed control strategy that enables the outputs of the followers to form the desired sub-formations and track the output of the leader in each subgroup. First, novel distributed observers are developed to estimate the states of the leaders under switching topologies. Then, GFTC protocols are designed based on the proposed observers. It is shown that, with the distributed protocol, the GFTC problem for HMASs under switching topologies is solved if the average dwell time associated with the switching topologies is larger than a fixed threshold. Finally, an example is provided to illustrate the effectiveness of the proposed control strategy.
Funding (abstract on the loyalty-based multi-agent system): supported by the National Natural Science Foundation of China (61503407, 61806219, 61703426, 61876189, 61703412) and the China Postdoctoral Science Foundation (2016M602996).
Funding (abstract on enclosing control): supported in part by the National Natural Science Foundation of China (61703411, 61834004) and the Natural Science Foundation of Shaanxi Province (2017JM6016).
Funding (abstract on multi-task cooperative MARL): the National Natural Science Foundation of China (62136008, 62293541), the Beijing Natural Science Foundation (4232056), and the Beijing Nova Program (20240484514).
Funding: The National Natural Science Foundation of China (W2431048); the Science and Technology Research Program of Chongqing Municipal Education Commission, China (KJZDK202300807); the Chongqing Natural Science Foundation, China (CSTB2024NSCQQCXMX0052).
Abstract: This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, a memory-based adaptive event-triggered mechanism is combined with neural networks to approximate the nonlinear terms in the networked system and to conserve system resources efficiently. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Funding: Supported by the National Natural Science Foundation of China (60373110, 60573130, 60502011).
Abstract: Inspired by immune theory and multi-agent systems, an immune multi-agent active defense model for network intrusion is established. The concept of the immune agent is introduced, and its running mechanism is established. A method that uses antibody concentration to quantitatively describe the degree of intrusion danger is presented. The model implements a multi-layer, distributed active defense mechanism against network intrusion. The experimental results show that the model is an effective solution for network security defense.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12272104, U22B2013).
Abstract: This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by a limited field of view. The primary objective is to explore the unknown environment to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on a Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs, which establish the essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. A framework that integrates centralized training with distributed execution is introduced to facilitate the improvement of the models. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance, with fewer collisions. The experiments also show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
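The abstract above says the GAT model attends only to adjacent nodes and learns its attention weights. The following is a minimal sketch of that general idea, using the standard graph-attention scoring rule (a LeakyReLU score followed by a softmax restricted to each node's neighbors). It is not the authors' model: the function name `gat_attention`, the toy features `h`, the weight matrix `W`, and the attention vector `a` are all hypothetical values chosen for illustration, and no training is performed.

```python
import math


def leaky_relu(z, slope=0.2):
    """LeakyReLU activation used for the raw attention scores."""
    return z if z > 0 else slope * z


def gat_attention(h, neighbors, W, a):
    """Compute attention weights restricted to each node's neighbors.

    h:         list of node feature vectors (hypothetical toy data)
    neighbors: dict mapping node index -> list of adjacent node indices
               (each node lists itself as well)
    W:         shared linear map, given as a list of rows
    a:         attention vector of length 2 * len(W)
    """
    # Apply the shared linear transform to every node feature vector.
    z = [[sum(w * x for w, x in zip(row, v)) for row in W] for v in h]

    alpha = {}
    for i, nbrs in neighbors.items():
        # Score each neighbor j using the concatenation [z_i || z_j].
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Softmax over neighbors only; non-adjacent nodes get no weight.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alpha


# Toy graph with three nodes; node 0 and node 2 are not adjacent.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[0.5, -0.2], [0.1, 0.3]]      # illustrative, untrained weights
a = [0.4, -0.1, 0.2, 0.3]          # illustrative, untrained attention vector
alpha = gat_attention(h, neighbors, W, a)
```

In a trained model, `W` and `a` would be learned parameters. Here they only show how the neighbor mask confines each node's attention to its adjacent nodes.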
Abstract: Consensus theory and noncooperative game theory respectively deal with cooperative and noncooperative interactions among multiple players/agents. They provide a natural framework for road pricing design, since each motorist may myopically optimize his or her own utility as a function of road price while also communicating with friends and neighbors about the traffic situation. This paper considers road pricing design using game theory and consensus theory. For the case where a system supervisor broadcasts information on the overall system to each agent, we present a variant of standard fictitious play called average strategy fictitious play (ASFP) for large-scale repeated congestion games. Only a weighted running average of all other players' actions is assumed to be available to each player. The ASFP reduces the burden of both information gathering and information processing for each player. Compared to the joint strategy fictitious play (JSFP) studied in the literature, the updating process of utility functions for each player is avoided. We prove that there exists at least one pure strategy Nash equilibrium for the congestion game under investigation, and that the players' actions generated by the ASFP with inertia (players' reluctance to change their previous actions) converge to a Nash equilibrium almost surely. For the case without broadcasting, a consensus protocol is introduced for individual agents to estimate the percentage of players choosing each resource, and the convergence property of the players' action profile is still ensured. The results are applied to road pricing design to achieve socially local optimal trip timing. Simulation results are provided based on real traffic data for the Singapore case study.
Funding: Supported by the Natural Science Foundation of Jiangsu Province, China (No. BK20240200); in part by the National Natural Science Foundation of China (Nos. 62271501, 62071488, 62471489 and U22B2002); in part by the Key Technologies R&D Program of Jiangsu, China (Prospective and Key Technologies for Industry) (Nos. BE2023022 and BE2023022-4); and in part by the Post-doctoral Fellowship Program of CPSF, China (No. GZB20240996).
Abstract: Because unmanned aerial vehicle (UAV) networks rely on line-of-sight (LoS) communication, they are highly susceptible to eavesdropping and surveillance. To address these security concerns, covert communication methods have been adopted. This paper explores the joint optimization of trajectory and transmission power in a multi-hop UAV relay covert communication system. Considering communication covertness, power constraints, and trajectory limitations, an algorithm based on multi-agent proximal policy optimization (MAPPO), named covert-MAPPO (C-MAPPO), is proposed. The proposed method leverages the strengths of both optimization algorithms and reinforcement learning to analyze and make joint decisions on the transmission power and flight trajectory strategies of the UAVs so that they cooperate. Simulation results demonstrate that the proposed method can maximize the system throughput while satisfying the covertness constraints, and that it outperforms benchmark algorithms in terms of system throughput and reward convergence speed.
Funding: Funded by the National Natural Science Foundation of China (No. 62302520).
Abstract: Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches, namely periodic or overly frequent MTD triggers, lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Funding: Supported in part by the National Natural Science Foundation of China (62495095, 62088101).
Abstract: Multi-agent systems (MASs) have demonstrated significant achievements in a wide range of tasks, leveraging their capacity for coordination and adaptation within complex environments. Moreover, enhancing their intelligent functionalities is crucial for tackling increasingly challenging tasks. This goal resonates with a paradigm shift within the artificial intelligence (AI) community from “internet AI” to “embodied AI”, and MASs with embodied AI are referred to as embodied multi-agent systems (EMASs). An EMAS has the potential to acquire generalized competencies through interactions with environments, enabling it to effectively address a variety of tasks and thereby make a substantial contribution to the quest for artificial general intelligence. Despite the burgeoning interest in this domain, a comprehensive review of EMASs has been lacking. This paper offers an analysis and synthesis of EMASs from a control perspective, conceptualizing each embodied agent as an entity equipped with a “brain” for decision and a “body” for environmental interaction. System designs are classified into open-loop, closed-loop, and double-loop categories, and EMAS implementations are discussed. Additionally, the current applications of EMASs and the challenges they face are summarized, and potential avenues for future research in this field are provided.
Funding: 2024 Jiangsu Province Youth Science and Technology Talent Support Project; 2024 Yancheng Key Research and Development Plan (Social Development) projects “Research and Application of Multi Agent Offline Distributed Trust Perception Virtual Wireless Sensor Network Algorithm” and “Research and Application of a New Type of Fishery Ship Safety Production Monitoring Equipment”.
Abstract: This paper focuses on the velocity-constrained consensus problem of discrete-time heterogeneous multi-agent systems with nonconvex constraints and arbitrarily switching topologies, where each agent has first-order or second-order dynamics. To solve this problem, a distributed algorithm is proposed based on a contraction operator. By employing the properties of stochastic matrices, it is shown that all agents’ position states converge to a common point, and that second-order agents’ velocity states remain in their corresponding nonconvex constraint sets and converge to zero, as long as the joint communication topology has a directed spanning tree. Finally, numerical simulation results are provided to verify the effectiveness of the proposed algorithm.
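The abstract above relies on stochastic-matrix properties and a spanning-tree condition. The simplest instance of that setting can be sketched as follows: first-order agents update their positions as x(k+1) = A x(k), where A is a fixed row-stochastic matrix whose graph contains a directed spanning tree. This is a deliberately simplified illustration, not the paper's algorithm: it omits the contraction operator, the velocity constraints, the second-order agents, and the switching topologies. The matrix `A` and the initial states are assumptions chosen for the example.

```python
def consensus_step(x, A):
    """One update x(k+1) = A x(k) for scalar agent positions."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def run_consensus(x0, A, steps=200):
    """Iterate the consensus update for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, A)
    return x


# Directed chain 0 -> 1 -> 2 -> 3: agent 0 is the root of a spanning tree.
# Every row sums to 1, so A is row-stochastic (illustrative weights).
A = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
x_final = run_consensus([4.0, -1.0, 2.0, 7.0], A)
print(x_final)  # all entries approach the root agent's initial position
```

Because agent 0 listens to no one in this example, the common point is its initial position. With a different spanning-tree structure the agreed value would be a weighted combination of the initial states.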
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52225212, 52272418, U22A20100) and the National Key Research and Development Program of China (Grant No. 2022YFB2503302).
Abstract: The development of chassis active safety control technology has improved vehicle stability under extreme conditions. However, its cross-system and multi-functional characteristics make it difficult for the controller to achieve cooperative goals. In addition, the chassis system, with its high complexity, numerous subsystems, and strong coupling, leads to low computing efficiency and poor control performance. Therefore, this paper proposes a scenario-driven hybrid distributed model predictive control algorithm with variable control topology. The algorithm divides the vehicle’s β−γ phase plane into multiple stability regions, forming a mapping between the control structure and the vehicle’s state. A control input fusion mechanism within the transition domain is designed to mitigate the system state oscillation and control input jitter caused by switching control structures. Then, a distributed state-space equation with state coupling and input coupling is constructed, and a weighted local agent cost function in quadratic programming form is derived. Through cost coupling, local agents can coordinate global performance goals. Finally, Simulink/CarSim joint simulation and hardware-in-the-loop (HIL) tests validate that the proposed algorithm improves vehicle stability while ensuring trajectory tracking accuracy and has good applicability for multi-objective coordinated control. This paper combines the advantages of distributed MPC and decentralized MPC, achieving a balance between approximating the global optimum and solution efficiency.
Funding: National Natural Science Foundation of China (No. 12071370).
Abstract: This paper focuses on the problem of leader-following consensus for nonlinear cascaded multi-agent systems. The control design for these systems is transformed into a sequence of control problems for lower-order error subsystems. A distributed consensus analysis for the corresponding error systems is conducted by employing recursive methods and virtual controllers, accompanied by a series of Lyapunov functions devised throughout the iterative process, which solves the leader-following consensus problem for a class of nonlinear cascaded multi-agent systems. Simulation examples illustrate the effectiveness of the proposed control algorithm.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2022YFB3303900 and in part by the National Natural Science Foundation of China under Grants 62103277 and 62025305.
Abstract: In this paper, the distributed optimal formation control problem of heterogeneous Euler–Lagrange multi-agent systems with generic formation constraints and inequality constraints is investigated. Based on primal–dual dynamics and the adaptive control technique, a distributed optimal formation controller consisting of a velocity reference signal generator and a velocity tracking controller is proposed. By using the optimality condition, the relationship between the equilibrium point of the closed-loop system and the optimal solution of the optimization problem is established. Then, by utilizing Lyapunov stability analysis, it is rigorously proved that the optimal formation is reached with the proposed controller. Lastly, simulation examples are provided to substantiate the theoretical results.
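The abstract above builds its controller on primal–dual dynamics for a problem with inequality constraints. As a hedged illustration of that building block only, the sketch below runs Euler-discretized primal–dual dynamics on a toy scalar problem: minimize (x − 2)² subject to x ≤ 1. The toy problem, the step size, and the function name `primal_dual` are assumptions made for this example. The paper's formation constraints, Euler–Lagrange agent dynamics, and adaptive tracking controller are not modeled.

```python
def primal_dual(step=0.01, iters=5000):
    """Euler-discretized primal-dual dynamics for the toy problem

        minimize (x - 2)^2   subject to   x - 1 <= 0.

    Lagrangian: L(x, lam) = (x - 2)^2 + lam * (x - 1), with lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(iters):
        grad_x = 2.0 * (x - 2.0) + lam               # dL/dx
        x_next = x - step * grad_x                   # primal descent
        lam = max(0.0, lam + step * (x - 1.0))       # projected dual ascent
        x = x_next
    return x, lam


x_star, lam_star = primal_dual()
print(x_star, lam_star)  # approaches the KKT point x = 1, lam = 2
```

At the equilibrium the constraint is active and the multiplier balances the cost gradient, which is the kind of link between equilibrium point and optimal solution that the abstract mentions.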
Abstract: This article investigates the time-varying output group formation tracking control (GFTC) problem for heterogeneous multi-agent systems (HMASs) under switching topologies. The objective is to design a distributed control strategy that enables the outputs of the followers to form the desired sub-formations and track the outputs of the leader in each subgroup. Firstly, novel distributed observers are developed to estimate the states of the leaders under switching topologies. Then, GFTC protocols are designed based on the proposed observers. It is shown that, with the distributed protocol, the GFTC problem for HMASs under switching topologies is solved if the average dwell time associated with the switching topologies is larger than a fixed threshold. Finally, an example is provided to illustrate the effectiveness of the proposed control strategy.