Consensus problems of first-order multi-agent systems with multiple time delays are investigated in this paper. We discuss three cases: 1) continuous, 2) discrete, and 3) a continuous system with a proportional plus derivative controller. In each case, the system contains simultaneous communication and input time delays. Supposing a dynamic multi-agent system with directed topology that contains a globally reachable node, the sufficient convergence condition of the system is discussed with respect to each of the three cases based on the generalized Nyquist criterion and the frequency-domain analysis approach, yielding conclusions that are either less conservative than or agree with previously published results. The results show that the convergence condition of the system depends mainly on each agent's input time delay and the adjacent weights but is independent of the communication delay between agents, whether the system is continuous or discrete. Finally, simulation examples are given to verify the theoretical analysis.
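As a rough illustration of the claim that communication delay alone need not prevent consensus, the sketch below simulates a discrete-time first-order protocol in which only neighbor states are delayed. It is not the paper's model: the update rule, the function name `simulate_delayed_consensus`, the step size `eps`, and the ring topology are all assumptions chosen for illustration, and input delay is not modeled at all.

```python
def simulate_delayed_consensus(adj, x0, eps, delay, steps):
    """Simulate x_i(k+1) = x_i(k) + eps * sum_j a_ij * (x_j(k - delay) - x_i(k)).

    adj[i][j] is the (assumed) weight agent i places on neighbor j.
    Only neighbor states are delayed; each agent uses its own current state.
    """
    n = len(x0)
    # history[0] is the state `delay` steps ago, history[-1] is the current state.
    history = [list(x0) for _ in range(delay + 1)]
    for _ in range(steps):
        current = history[-1]
        delayed = history[0]
        nxt = []
        for i in range(n):
            coupling = sum(adj[i][j] * (delayed[j] - current[i]) for j in range(n))
            nxt.append(current[i] + eps * coupling)
        history.append(nxt)
        history.pop(0)
    return history[-1]


# Directed ring of four agents: agent i listens only to agent i-1.
n_agents = 4
ring = [[1.0 if j == (i - 1) % n_agents else 0.0 for j in range(n_agents)]
        for i in range(n_agents)]
start = [1.0, 3.0, -2.0, 6.0]

final = simulate_delayed_consensus(ring, start, eps=0.3, delay=5, steps=3000)
spread = max(final) - min(final)
print("final states:", final)
print("spread:", spread)
```

With `eps` times the largest in-degree below 1, every update is a convex combination of current and past states, so the states stay inside the range of the initial values while the spread shrinks. Changing `delay` changes the speed of convergence and the final value in this toy setting, but not whether agreement is reached.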
This paper studies the consensus problems for a group of agents with switching topology and time-varying communication delays, where the dynamics of agents is modeled as a high-order integrator. A linear distributed consensus protocol is proposed, which only depends on the agent's own information and its neighbors' partial information. By introducing a decomposition of the state vector and performing a state space transformation, the closed-loop dynamics of the multi-agent system is converted into two decoupled subsystems. Based on the decoupled subsystems, some sufficient conditions for the convergence to consensus are established, which provide the upper bounds on the admissible communication delays. Also, the explicit expression of the consensus state is derived. Moreover, the results on the consensus seeking of the group of high-order agents have been extended to a network of agents with dynamics modeled as a completely controllable linear time-invariant system. It is proved that the convergence to consensus of this network is equivalent to that of the group of high-order agents. Finally, some numerical examples are given to demonstrate the effectiveness of the main results.
The time-varying network topology can significantly affect the stability of multi-agent systems. This paper examines the stability of leader-follower multi-agent systems with general linear dynamics and switching network topologies, which have applications in the platooning of connected vehicles. The switching interaction topology is modeled as a class of directed graphs in order to describe the information exchange between multi-agent systems, where the eigenvalues of every associated matrix are required to be positive real. The Hurwitz criterion and the Riccati inequality are used to design a distributed control law and estimate the convergence speed of the closed-loop system. A sufficient condition is provided for the stability of multi-agent systems under switching topologies. A common Lyapunov function is formulated to prove closed-loop stability for the directed network with switching topologies. The result is applied to a typical cyber-physical system, namely a connected vehicle platoon, which illustrates the effectiveness of the proposed method.
This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
This paper proposes a Multi-Agent Attention Proximal Policy Optimization (MA2PPO) algorithm to address problems such as credit assignment, low collaboration efficiency, and weak strategy generalization in the cooperative pursuit tasks of multiple unmanned aerial vehicles (UAVs). Traditional algorithms often fail to effectively identify critical cooperative relationships in such tasks, leading to low capture efficiency and a significant decline in performance when the scale expands. To tackle these issues, based on the proximal policy optimization (PPO) algorithm, MA2PPO adopts the centralized training with decentralized execution (CTDE) framework and introduces a dynamic decoupling mechanism, that is, sharing the multi-head attention (MHA) mechanism for critics during centralized training to solve the credit assignment problem. This method enables the pursuers to identify highly correlated interactions with their teammates, effectively eliminate irrelevant and weakly relevant interactions, and decompose large-scale cooperation problems into decoupled sub-problems, thereby enhancing the collaborative efficiency and policy stability among multiple agents. Furthermore, a reward function has been devised to help the pursuers encircle the escapee by combining a formation reward with a distance reward, which incentivizes UAVs to develop sophisticated cooperative pursuit strategies. Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs. Moreover, experiments on scalability have demonstrated that the algorithm is suitable for large-scale multi-UAV systems.
The Industrial Internet of Things (IIoT) is increasingly vulnerable to sophisticated cyber threats, particularly zero-day attacks that exploit unknown vulnerabilities and evade traditional security measures. To address this critical challenge, this paper proposes a dynamic defense framework named Zero-day-aware Stackelberg Game-based Multi-Agent Distributed Deep Deterministic Policy Gradient (ZSG-MAD3PG). The framework integrates Stackelberg game modeling with the Multi-Agent Distributed Deep Deterministic Policy Gradient (MAD3PG) algorithm and incorporates defensive deception (DD) strategies to achieve adaptive and efficient protection. While conventional methods typically incur considerable resource overhead and exhibit higher latency due to static or rigid defensive mechanisms, the proposed ZSG-MAD3PG framework mitigates these limitations through multi-stage game modeling and adaptive learning, enabling more efficient resource utilization and faster response times. The Stackelberg-based architecture allows defenders to dynamically optimize packet sampling strategies, while attackers adjust their tactics to reach rapid equilibrium. Furthermore, dynamic deception techniques reduce the time required for the concealment of attacks and the overall system burden. A lightweight behavioral fingerprinting detection mechanism further enhances real-time zero-day attack identification within industrial device clusters. ZSG-MAD3PG demonstrates higher true positive rates (TPR) and lower false alarm rates (FAR) compared to existing methods, while also achieving improved latency, resource efficiency, and stealth adaptability in IIoT zero-day defense scenarios.
Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles brings challenges for vehicles to independently perform avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and effectively reduces approximately 20% of the latency of avatar task execution in UAV-assisted vehicular Metaverses.
The multi-agent path planning problem presents significant challenges in dynamic environments, primarily due to the ever-changing positions of obstacles and the complex interactions between agents' actions. These factors contribute to a tendency for the solution to converge slowly, and in some cases, diverge altogether. In addressing this issue, this paper introduces a novel approach utilizing a double dueling deep Q-network (D3QN), tailored for dynamic multi-agent environments. A novel reward function based on multi-agent positional constraints is designed, and a training strategy based on incremental learning is performed to achieve collaborative path planning of multiple agents. Moreover, a greedy and Boltzmann probability selection policy is introduced for action selection to avoid convergence to local extrema. To match radar and image sensors, a convolutional neural network-long short-term memory (CNN-LSTM) architecture is constructed to extract the features of multi-source measurements as the input of the D3QN. The algorithm's efficacy and reliability are validated in a simulated environment, utilizing the Robot Operating System and Gazebo. The simulation results show that the proposed algorithm provides a real-time solution for path planning tasks in dynamic scenarios. In terms of the average success rate and accuracy, the proposed method is superior to other deep learning algorithms, and the convergence speed is also improved.
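The abstract does not say how the greedy and Boltzmann selection rules are combined, so the following is only a minimal sketch of one plausible mix. The names `boltzmann_probabilities`, `select_action`, `temperature`, and `greedy_prob` are hypothetical and not taken from the paper; the Boltzmann part is the standard softmax over Q-values.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values; subtracting the max keeps exp() numerically stable."""
    peak = max(q_values)
    weights = [math.exp((q - peak) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, greedy_prob, rng):
    """Assumed mix: act greedily with probability greedy_prob, else sample Boltzmann."""
    actions = range(len(q_values))
    if rng.random() < greedy_prob:
        return max(actions, key=lambda a: q_values[a])
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(actions, weights=probs, k=1)[0]


rng = random.Random(0)
q = [0.1, 0.9, 0.3]
print(boltzmann_probabilities(q, temperature=0.5))
print(select_action(q, temperature=0.5, greedy_prob=0.7, rng=rng))
```

A high `temperature` flattens the distribution toward uniform exploration, while a low one concentrates probability on the best-valued action, which is why such a rule can help an agent leave a locally optimal policy early in training.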
This article investigates the problem of robust adaptive leaderless consensus for heterogeneous uncertain non-minimum-phase linear multi-agent systems over directed communication graphs. Each agent is assumed to be of unknown nominal dynamics and also subject to external disturbances and/or unmodeled dynamics. A novel distributed robust adaptive control strategy is proposed. It is shown that the robust adaptive leaderless consensus problem is solved with the proposed control strategy under some sufficient conditions. Two examples are provided to demonstrate the efficacy of the proposed control strategy.
To address the low solution accuracy and high decision pressure that a single agent faces in large-scale dynamic task allocation (DTA) with a high-dimensional decision space, this paper draws on deep reinforcement learning (DRL) theory and proposes an improved Multi-Agent Deep Deterministic Policy Gradient algorithm (MADDPG-D2), built on a multi-agent architecture with a dual experience replay pool and dual noise, to improve the efficiency of DTA. The algorithm builds on the traditional Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm: a double noise mechanism enlarges the action exploration space in the early stage of training, and a double experience pool improves the data utilization rate. In addition, to accelerate agent training and solve the cold-start problem, prior knowledge is applied during training. Finally, the MADDPG-D2 algorithm is compared and analyzed on a digital battlefield of ground and air confrontation. The experimental results show that agents trained by the MADDPG-D2 algorithm achieve higher win rates and average rewards, use resources more reasonably, and better handle the high-dimensional decision space that is difficult for traditional single-agent algorithms. The MADDPG-D2 algorithm based on multi-agent architecture proposed in this paper has certain superiority and rationality in DTA.
Double-integrator multi-agent systems (MASs) might not achieve dynamical consensus, even if only partial agents suffer from self-sensing function failures (SSFFs). SSFFs might be asynchronous in real engineering applications. The existing fault-tolerant dynamical consensus protocol suitable for synchronous SSFFs cannot be directly used to tackle fault-tolerant dynamical consensus of double-integrator MASs with partial agents subject to asynchronous SSFFs. Motivated by these facts, this paper explores a new fault-tolerant dynamical consensus protocol suitable for asynchronous SSFFs. First, multi-hop communication together with the idea of treating asynchronous SSFFs as multiple piecewise synchronous SSFFs is used for recovering the connectivity of network topology among all normal agents. Second, a fault-tolerant dynamical consensus protocol is designed for double-integrator MASs by utilizing the history information of an agent subject to SSFF for computing its own state information at the instants when its minimum-hop normal neighbor set changes. Then, it is theoretically proved that if the strategy of network topology connectivity recovery and the fault-tolerant dynamical consensus protocol with proper time-varying gains are used simultaneously, double-integrator MASs with all normal agents and all agents subject to SSFFs can reach dynamical consensus. Finally, comparison numerical simulations are given to illustrate the effectiveness of the theoretical results.
This paper investigates the consensus control of multi-agent systems (MASs) with constrained input using the dynamic event-triggered mechanism (ETM). For MASs with small-scale networks, a centralized dynamic ETM that uses global information of the MASs is first designed. Then, a distributed dynamic ETM which only uses local information is developed for MASs with large-scale networks. It is shown that the semi-global consensus of the MASs can be achieved by the designed bounded control protocol, where the Zeno phenomenon is eliminated by a designable minimum inter-event time. In addition, an adjustable parameter makes it easier to find a trade-off between the convergence rate and the minimum inter-event time. Furthermore, the results are extended to regional consensus of the MASs with the bounded control protocol. Numerical simulations show the effectiveness of the proposed approach.
This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph Attention Network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. By introducing a framework that integrates centralized training and distributed execution, the advancement of models is facilitated. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
It is of great scientific significance to construct a 3D dynamic structural color with a special color effect based on the microlens array. However, the problems of imperfect mechanisms and poor color quality need to be solved. A method of 3D structural color turning on periodic metasurfaces fabricated by the microlens array and self-assembly technology was proposed in this study. In the experiment, polydimethylsiloxane (PDMS) flexible film was used as a substrate, and SiO2 microspheres were scraped into grooves of the PDMS film to form 3D photonic crystal structures. By adjusting the number of blade-coating passes and the microsphere concentration, high-saturation structural color micropatterns were obtained. These films were then matched with microlens arrays to produce dynamic graphics with iridescent effects. The results showed that blade coating twice with a SiO2 microsphere concentration of 50% gave the best conditions. This method demonstrates the potential for being widely applied in anticounterfeiting printing and ultra-high-resolution display.
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
The hot deformation behavior of Pt−10Ir alloy was studied under a wide range of deformation parameters. At a low deformation temperature (950−1150℃), the softening mechanism is primarily dynamic recovery. In addition, dynamic recrystallization by progressive lattice rotation near grain boundaries (DRX by LRGBs) and microshear-band-assisted dynamic recrystallization (MSBs assisted DRX) coordinate the deformation. However, it is difficult for the dynamic softening to offset the strain hardening due to a limited amount of DRXed grains. At a high deformation temperature (1250−1350℃), three main DRX mechanisms associated with strain rates occur: DRX by LRGBs, DRX by a homogeneous increase in misorientation (HIM), and geometric DRX (GDRX). With increasing strain, DRX by LRGBs is enhanced gradually under high strain rates; the “pinch-off” effect is enhanced at low strain rates, which is conducive to the formation of a uniform and fine microstructure.
This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, by combining a memory-based adaptive event-triggered mechanism with neural networks, the paper aims to approximate the nonlinear terms in the networked system and efficiently conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
To improve the vertical axis wind turbine (VAWT) design, the angle of attack (AOA) and airfoil data must be treated correctly. The present paper develops a method for determining AOA on a VAWT based on computational fluid dynamics (CFD) analysis. First, a CFD analysis of a two-bladed VAWT equipped with a NACA 0012 airfoil is conducted. The thrust and power coefficients are validated through experiments. Second, the blade force and velocity data at monitoring points are collected. The AOA at different azimuth angles is determined by removing the blade self-induction at the monitoring point. Then, the lift and drag coefficients as a function of AOA are extracted. Results show that this method is independent of the selection of monitoring points located at a certain distance from the blades, and that the extracted dynamic stall hysteresis is more precise than the one obtained with the “usual” method, which does not consider the self-induction from bound vortices.
Metal Additive Manufacturing (MAM) technology has become an important means of rapid prototyping and precision manufacturing of special high dynamic heterogeneous complex parts. In response to micromechanical defects such as porosity, significant deformation, surface cracks, and hard-to-control surface morphology encountered during the selective laser melting (SLM) additive manufacturing (AM) process of specialized Micro Electromechanical System (MEMS) components, multiparameter optimization and micro powder melt pool/macro-scale mechanical properties control simulation of specialized components are conducted. The optimal parameters obtained through high-precision preparation and machining of components and static/high dynamic verification are: laser power of 110 W, laser speed of 600 mm/s, laser diameter of 75 μm, and scanning spacing of 50 μm. The density of components fabricated with these parameters can reach 99.15%, the surface hardness can reach 51.9 HRA, the yield strength can reach 550 MPa, the maximum machining error of the components is 4.73%, and the average surface roughness is 0.45 μm. Through dynamic hammering and high dynamic firing verification, SLM components meet the requirements for overload resistance. The results have proven that MAM technology can provide a new means for the processing of MEMS components applied in high dynamic environments. The parameters obtained in the conclusion can provide a design basis for the additive preparation of MEMS components.
Funding (first-order consensus with multiple time delays): Project supported in part by the National Natural Science Foundation of China (Grant Nos. 60973114 and 61170249), in part by the Natural Science Foundation of CQCSTC (Grant Nos. 2009BA2024 and cstc2011jjA1320), and in part by the State Key Laboratory of Power Transmission Equipment & System Security and New Technology, Chongqing University (Grant No. 2007DA10512711206).
Funding (high-order consensus with switching topology and time-varying delays): Supported by the National Natural Science Foundation of China (Nos. 60674050, 60736022, 10972002, 60774089, 60704039).
Funding (leader-follower stability under switching topologies): This work is supported by the International Science and Technology Cooperation Program of China (2019YFE0100200) and the Beijing Natural Science Foundation (JQ18010). It is also partially supported by the Tsinghua University-Didi Joint Research Center for Future Mobility.
Funding: Supported by the Science and Technology Project of State Grid Sichuan Electric Power Company Chengdu Power Supply Company under Grant No. 521904240005.
Abstract: This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
Funding: Supported by the National Research and Development Program of China under Grant JCKY2018607C019 and in part by the Key Laboratory Fund of UAV of Northwestern Polytechnical University under Grant 2021JCJQLB0710L.
Abstract: This paper proposes a Multi-Agent Attention Proximal Policy Optimization (MA2PPO) algorithm to address credit assignment, low collaboration efficiency, and weak strategy generalization in cooperative pursuit tasks of multiple unmanned aerial vehicles (UAVs). Traditional algorithms often fail to identify critical cooperative relationships in such tasks, leading to low capture efficiency and a significant decline in performance as the scale expands. To tackle these issues, MA2PPO builds on the proximal policy optimization (PPO) algorithm, adopts the centralized training with decentralized execution (CTDE) framework, and introduces a dynamic decoupling mechanism: the critics share a multi-head attention (MHA) mechanism during centralized training to solve the credit assignment problem. This method enables the pursuers to identify highly correlated interactions with their teammates, eliminate irrelevant and weakly relevant interactions, and decompose large-scale cooperation problems into decoupled sub-problems, thereby enhancing collaborative efficiency and policy stability among multiple agents. Furthermore, a reward function that combines a formation reward with a distance reward is devised to help the pursuers encircle the escapee, which incentivizes UAVs to develop sophisticated cooperative pursuit strategies. Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs. Moreover, scalability experiments demonstrate that the algorithm is suitable for large-scale multi-UAV systems.
Funding: Funded in part by the Humanities and Social Sciences Planning Foundation of the Ministry of Education of China under Grant No. 24YJAZH123, the National Undergraduate Innovation and Entrepreneurship Training Program of China under Grant No. 202510347069, and the Huzhou Science and Technology Planning Foundation under Grant No. 2023GZ04.
Abstract: The Industrial Internet of Things (IIoT) is increasingly vulnerable to sophisticated cyber threats, particularly zero-day attacks that exploit unknown vulnerabilities and evade traditional security measures. To address this critical challenge, this paper proposes a dynamic defense framework named Zero-day-aware Stackelberg Game-based Multi-Agent Distributed Deep Deterministic Policy Gradient (ZSG-MAD3PG). The framework integrates Stackelberg game modeling with the Multi-Agent Distributed Deep Deterministic Policy Gradient (MAD3PG) algorithm and incorporates defensive deception (DD) strategies to achieve adaptive and efficient protection. While conventional methods typically incur considerable resource overhead and exhibit higher latency due to static or rigid defensive mechanisms, the proposed ZSG-MAD3PG framework mitigates these limitations through multi-stage game modeling and adaptive learning, enabling more efficient resource utilization and faster response times. The Stackelberg-based architecture allows defenders to dynamically optimize packet sampling strategies, while attackers adjust their tactics to reach rapid equilibrium. Furthermore, dynamic deception techniques reduce the time required for the concealment of attacks and the overall system burden. A lightweight behavioral fingerprinting detection mechanism further enhances real-time zero-day attack identification within industrial device clusters. ZSG-MAD3PG demonstrates higher true positive rates (TPR) and lower false alarm rates (FAR) compared to existing methods, while also achieving improved latency, resource efficiency, and stealth adaptability in IIoT zero-day defense scenarios.
Funding: Supported in part by NSFC (62102099, U22A2054, 62101594); in part by the Pearl River Talent Recruitment Program (2021QN02S643); the Guangzhou Basic Research Program (2023A04J1699); in part by the National Research Foundation, Singapore, and the Infocomm Media Development Authority under its Future Communications Research Development Programme; DSO National Laboratories under the AI Singapore Programme under AISG Award No. AISG2-RP-2020-019; the Energy Research Test-Bed and Industry Partnership Funding Initiative, Energy Grid (EG) 2.0 programme; DesCartes and the Campus for Research Excellence and Technological Enterprise (CREATE) programme; MOE Tier 1 under Grant RG87/22; in part by the Singapore University of Technology and Design (SUTD) (SRG-ISTD-2021-165); in part by the SUTD-ZJU IDEA Grant SUTD-ZJU (VP) 202102; and in part by the Ministry of Education, Singapore, through its SUTD Kickstarter Initiative (SKI 20210204).
Abstract: Avatars, as promising digital representations and service assistants of users in Metaverses, can enable drivers and passengers to immerse themselves in 3D virtual services and spaces of UAV-assisted vehicular Metaverses. However, avatar tasks include a multitude of human-to-avatar and avatar-to-avatar interactive applications, e.g., augmented reality navigation, which consume intensive computing resources. It is inefficient and impractical for vehicles to process avatar tasks locally. Fortunately, migrating avatar tasks to the nearest roadside units (RSUs) or unmanned aerial vehicles (UAVs) for execution is a promising solution to decrease computation overhead and reduce task processing latency, while the high mobility of vehicles makes it challenging for vehicles to independently make avatar migration decisions depending on current and future vehicle status. To address these challenges, in this paper, we propose a novel avatar task migration system based on multi-agent deep reinforcement learning (MADRL) to execute immersive vehicular avatar tasks dynamically. Specifically, we first formulate the problem of avatar task migration from vehicles to RSUs/UAVs as a partially observable Markov decision process that can be solved by MADRL algorithms. We then design the multi-agent proximal policy optimization (MAPPO) approach as the MADRL algorithm for the avatar task migration problem. To overcome slow convergence resulting from the curse of dimensionality and non-stationary issues caused by shared parameters in MAPPO, we further propose a transformer-based MAPPO approach via sequential decision-making models for the efficient representation of relationships among agents. Finally, to motivate terrestrial or non-terrestrial edge servers (e.g., RSUs or UAVs) to share computation resources and ensure traceability of the sharing records, we apply smart contracts and blockchain technologies to achieve secure sharing management. Numerical results demonstrate that the proposed approach outperforms the MAPPO approach by around 2% and reduces the latency of avatar task execution in UAV-assisted vehicular Metaverses by approximately 20%.
Funding: National Natural Science Foundation of China (Nos. 61673262 and 50779033); National GF Basic Research Program (No. JCKY2021110B134); Fundamental Research Funds for the Central Universities.
Abstract: The multi-agent path planning problem presents significant challenges in dynamic environments, primarily due to the ever-changing positions of obstacles and the complex interactions between agents' actions. These factors cause the solution to converge slowly and, in some cases, to diverge altogether. To address this issue, this paper introduces a novel approach using a double dueling deep Q-network (D3QN) tailored for dynamic multi-agent environments. A novel reward function based on multi-agent positional constraints is designed, and a training strategy based on incremental learning is used to achieve collaborative path planning of multiple agents. Moreover, a greedy and Boltzmann probability selection policy is introduced for action selection to avoid convergence to a local extremum. To match radar and image sensors, a convolutional neural network-long short-term memory (CNN-LSTM) architecture is constructed to extract features of the multi-source measurements as the input of the D3QN. The algorithm's efficacy and reliability are validated in a simulated environment using the Robot Operating System and Gazebo. The simulation results show that the proposed algorithm provides a real-time solution for path planning tasks in dynamic scenarios. In terms of average success rate and accuracy, the proposed method is superior to other deep learning algorithms, and the convergence speed is also improved.
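The abstract does not say how the greedy and Boltzmann selection rules are combined, so the following is only one plausible reading, offered as a minimal sketch: with some probability the agent takes the action with the highest Q-value, and otherwise it samples an action from a Boltzmann (softmax) distribution over the Q-values. The function names and the parameters `temperature` and `greedy_prob` are hypothetical and do not come from the paper.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values; a lower temperature gives greedier choices."""
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, greedy_prob, rng=random):
    """Mix greedy and Boltzmann selection (assumed combination rule).

    With probability greedy_prob return the argmax action; otherwise sample
    an action index from the Boltzmann distribution over q_values.
    """
    actions = range(len(q_values))
    if rng.random() < greedy_prob:
        return max(actions, key=lambda a: q_values[a])
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(actions, weights=probs, k=1)[0]


# Hypothetical Q-values for four actions.
q = [0.2, 1.5, 0.9, -0.3]
probs = boltzmann_probabilities(q, temperature=0.5)
action = select_action(q, temperature=0.5, greedy_prob=0.7, rng=random.Random(0))
```

Sampling from the Boltzmann distribution keeps some probability mass on actions with lower Q-values, which is the usual motivation for using it to escape local extrema; annealing `temperature` toward zero over training makes the policy approach pure greedy selection.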
Funding: Research Grants Council of Hong Kong under Grant CityU-11205221.
Abstract: This article investigates the problem of robust adaptive leaderless consensus for heterogeneous uncertain nonminimum-phase linear multi-agent systems over directed communication graphs. Each agent is assumed to have unknown nominal dynamics and to be subject to external disturbances and/or unmodeled dynamics. A novel distributed robust adaptive control strategy is proposed. It is shown that the robust adaptive leaderless consensus problem is solved with the proposed control strategy under some sufficient conditions. Two examples are provided to demonstrate the efficacy of the proposed control strategy.
Funding: This research was funded by the Project of the National Natural Science Foundation of China, Grant Number 62106283.
Abstract: To address the low solution accuracy and high decision pressure that a single agent faces in large-scale dynamic task allocation (DTA) with a high-dimensional decision space, this paper combines deep reinforcement learning (DRL) theory with a multi-agent architecture and proposes an improved Multi-Agent Deep Deterministic Policy Gradient algorithm (MADDPG-D2) with a dual experience replay pool and dual noise to improve the efficiency of DTA. The algorithm is based on the traditional Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. It introduces a double noise mechanism to enlarge the action exploration space in the early stage of the algorithm and a double experience pool to improve the data utilization rate. At the same time, to accelerate the training of the agents and to solve the cold-start problem of training, a priori knowledge is applied to the training of the algorithm. Finally, the MADDPG-D2 algorithm is compared and analyzed on a digital battlefield of ground and air confrontation. The experimental results show that the agents trained by the MADDPG-D2 algorithm have higher win rates and average rewards, use resources more reasonably, and better overcome the difficulty that traditional single-agent algorithms have in high-dimensional decision spaces. The MADDPG-D2 algorithm based on a multi-agent architecture proposed in this paper is shown to be advantageous and reasonable for DTA.
Funding: National Natural Science Foundation of China (No. 61876073); Fundamental Research Funds for the Central Universities of China (No. JUSRP21920).
Abstract: Double-integrator multi-agent systems (MASs) might not achieve dynamical consensus even if only some of the agents suffer from self-sensing function failures (SSFFs). SSFFs might be asynchronous in real engineering applications. The existing fault-tolerant dynamical consensus protocol suitable for synchronous SSFFs cannot be directly used to tackle fault-tolerant dynamical consensus of double-integrator MASs with some agents subject to asynchronous SSFFs. Motivated by these facts, this paper explores a new fault-tolerant dynamical consensus protocol suitable for asynchronous SSFFs. First, multi-hop communication, together with the idea of treating asynchronous SSFFs as multiple piecewise synchronous SSFFs, is used to recover the connectivity of the network topology among all normal agents. Second, a fault-tolerant dynamical consensus protocol is designed for double-integrator MASs by using the history information of an agent subject to SSFF to compute its own state information at the instants when its minimum-hop normal neighbor set changes. Then, it is theoretically proved that if the strategy of network topology connectivity recovery and the fault-tolerant dynamical consensus protocol with proper time-varying gains are used simultaneously, double-integrator MASs with all normal agents and all agents subject to SSFFs can reach dynamical consensus. Finally, comparative numerical simulations are given to illustrate the effectiveness of the theoretical results.
Funding: Supported in part by the National Natural Science Foundation of China (51939001, 61976033, 62273072) and the Natural Science Foundation of Sichuan Province (2022NSFSC0903).
Abstract: This paper investigates the consensus control of multi-agent systems (MASs) with constrained input using a dynamic event-triggered mechanism (ETM). For MASs with small-scale networks, a centralized dynamic ETM that uses global information of the MASs is first designed. Then, a distributed dynamic ETM that uses only local information is developed for MASs with large-scale networks. It is shown that semi-global consensus of the MASs can be achieved by the designed bounded control protocol, where the Zeno phenomenon is eliminated by a designable minimum inter-event time. In addition, an adjustable parameter makes it easier to find a trade-off between the convergence rate and the minimum inter-event time. Furthermore, the results are extended to regional consensus of the MASs with the bounded control protocol. Numerical simulations show the effectiveness of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12272104, U22B2013).
Abstract: This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by a limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on a Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. Model training is carried out within a framework that integrates centralized training and distributed execution. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Abstract: It is of great scientific significance to construct a 3D dynamic structural color with a special color effect based on the microlens array. However, the problems of imperfect mechanisms and poor color quality need to be solved. A method of 3D structural color turning on periodic metasurfaces fabricated by the microlens array and self-assembly technology was proposed in this study. In the experiment, a Polydimethylsiloxane (PDMS) flexible film was used as a substrate, and SiO2 microspheres were scraped into grooves of the PDMS film to form 3D photonic crystal structures. By adjusting the number of blade-coating passes and the microsphere concentration, high-saturation structural color micropatterns were obtained. These films were then matched with microlens arrays to produce dynamic graphics with iridescent effects. The results showed that two blade-coating passes and a SiO2 microsphere concentration of 50% are the best conditions. This method demonstrates the potential for wide application in anticounterfeiting printing and ultra-high-resolution display.
Funding: The National Natural Science Foundation of China (62136008, 62293541); the Beijing Natural Science Foundation (4232056); the Beijing Nova Program (20240484514).
Abstract: Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
Funding: Financial support from the National Natural Science Foundation of China (Nos. 52161023, 51901204); the Science and Technology Project of Yunnan Precious Metal Laboratory, China (No. YPML-2023050208); the Yunnan Science and Technology Planning Project, China (Nos. 202201AU070010, 202301AT070276, 202302AB080008, 202303AA080001); and the Postgraduate Research and Innovation Foundation of Yunnan University, China (No. 2021Y338).
Abstract: The hot deformation behavior of Pt−10Ir alloy was studied under a wide range of deformation parameters. At a low deformation temperature (950−1150 ℃), the softening mechanism is primarily dynamic recovery. In addition, dynamic recrystallization by progressive lattice rotation near grain boundaries (DRX by LRGBs) and microshear bands assisted dynamic recrystallization (MSBs assisted DRX) coordinate the deformation. However, it is difficult for the dynamic softening to offset the strain hardening due to a limited amount of DRXed grains. At a high deformation temperature (1250−1350 ℃), three main DRX mechanisms associated with strain rates occur: DRX by LRGBs, DRX by a homogeneous increase in misorientation (HIM), and geometric DRX (GDRX). With increasing strain, DRX by LRGBs is enhanced gradually under high strain rates; the "pinch-off" effect is enhanced at low strain rates, which is conducive to the formation of a uniform and fine microstructure.
Funding: The National Natural Science Foundation of China (W2431048); the Science and Technology Research Program of Chongqing Municipal Education Commission, China (KJZDK202300807); the Chongqing Natural Science Foundation, China (CSTB2024NSCQQCXMX0052).
Abstract: This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, a memory-based adaptive event-triggered mechanism is combined with neural networks to approximate the nonlinear terms in the networked system and conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Abstract: To improve vertical axis wind turbine (VAWT) design, the angle of attack (AOA) and airfoil data must be treated correctly. The present paper develops a method for determining the AOA on a VAWT based on computational fluid dynamics (CFD) analysis. First, a CFD analysis of a two-bladed VAWT equipped with a NACA 0012 airfoil is conducted. The thrust and power coefficients are validated through experiments. Second, the blade force and velocity data at monitoring points are collected. The AOA at different azimuth angles is determined by removing the blade self-induction at the monitoring point. Then, the lift and drag coefficients as a function of AOA are extracted. Results show that this method is independent of the selection of monitoring points located at a certain distance from the blades, and that the extracted dynamic stall hysteresis is more precise than the one obtained with the "usual" method, which does not consider the self-induction from bound vortices.
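For context on what the AOA correction in this abstract is correcting, the sketch below computes only the textbook geometric angle of attack of a VAWT blade, which ignores all induced velocities, including the blade self-induction that the paper's method removes. It is not the paper's CFD-based procedure. It assumes a uniform free stream, zero blade pitch, and an azimuth measured from the position where the blade moves directly into the wind; sign and azimuth conventions differ between authors, and the function name and tip speed ratio are hypothetical.

```python
import math


def geometric_aoa(azimuth_rad, tip_speed_ratio):
    """Geometric angle of attack (radians) of a VAWT blade, no induction.

    In the blade frame the relative wind has a tangential component
    (tip_speed_ratio + cos(azimuth)) and a normal component sin(azimuth),
    both scaled by the free-stream speed, so
    AOA = atan2(sin(azimuth), tip_speed_ratio + cos(azimuth)).
    """
    return math.atan2(math.sin(azimuth_rad), tip_speed_ratio + math.cos(azimuth_rad))


# Hypothetical sweep over one revolution at a tip speed ratio of 3.
tsr = 3.0
sweep = [
    (deg, math.degrees(geometric_aoa(math.radians(deg), tsr)))
    for deg in range(0, 360, 30)
]
```

The geometric value oscillates symmetrically over a revolution and shrinks as the tip speed ratio grows. The local AOA seen by a real blade differs from it because the rotor and the blade's own bound vorticity change the inflow.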
Funding: Funded by the National Natural Science Foundation of China Youth Fund (Grant No. 62304022), the Science and Technology on Electromechanical Dynamic Control Laboratory (China, Grant No. 6142601012304), and the 2022-2024 China Association for Science and Technology Innovation Integration Association Youth Talent Support Project (Grant No. 2022QNRC001).
Abstract: Metal Additive Manufacturing (MAM) technology has become an important means of rapid prototyping and precision manufacturing of special high dynamic heterogeneous complex parts. In response to micromechanical defects such as porosity, significant deformation, surface cracks, and hard-to-control surface morphology encountered during the selective laser melting (SLM) additive manufacturing (AM) of specialized Micro Electromechanical System (MEMS) components, multiparameter optimization and micro powder melt pool/macro-scale mechanical properties control simulation of specialized components are conducted. The optimal parameters obtained through high-precision preparation and machining of components and static/high dynamic verification are: laser power of 110 W, laser speed of 600 mm/s, laser diameter of 75 μm, and scanning spacing of 50 μm. With these parameters, the density of the components can reach 99.15%, the surface hardness can reach 51.9 HRA, the yield strength can reach 550 MPa, the maximum machining error of the components is 4.73%, and the average surface roughness is 0.45 μm. Dynamic hammering and high dynamic firing verification show that the SLM components meet the requirements for overload resistance. The results demonstrate that MAM technology can provide a new means of processing MEMS components used in high dynamic environments. The parameters obtained can provide a design basis for the additive preparation of MEMS components.