In this paper, the leader-following tracking problem of fractional-order multi-agent systems is addressed. The dynamics of each agent may be heterogeneous and may contain unknown nonlinearities. Under the assumptions that the interaction topology is undirected and connected and that the unknown nonlinear dynamics can be parameterized by a neural network, an adaptive learning law is proposed to deal with the unknown nonlinear dynamics, based on which a class of cooperative tracking protocols is constructed. The feedback gain matrix is obtained by solving an algebraic Riccati equation. To construct fully distributed cooperative tracking protocols, an adaptive law is also adopted to adjust the coupling weight. With the developed control laws, we prove that all signals in the closed-loop system are uniformly ultimately bounded. Finally, a simple simulation example is provided to illustrate the established result.
In this paper, we study some new fractional-order multi-agent systems with current and delay states (FMASCD). Using the generalized Nyquist stability criterion and Gerschgorin's circle theorem, we obtain the bounded-input bounded-output (BIBO) stability and asymptotic consensus of the FMASCD under mild conditions. Moreover, we give some numerical examples to illustrate our main results.
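As a rough illustration of how Gerschgorin's (Gershgorin's) circle theorem bounds eigenvalues in consensus analysis, the sketch below computes the discs of a graph Laplacian. It is not taken from the paper: the 3-node path graph and the helper names `laplacian` and `gershgorin_discs` are assumptions made for this example.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric 0/1 adjacency matrix."""
    n = len(adjacency)
    return [
        [sum(adjacency[i]) if i == j else -adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        # Radius = sum of absolute off-diagonal entries in the row.
        radius = sum(abs(value) for j, value in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs


# Hypothetical example: undirected path graph 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
discs = gershgorin_discs(laplacian(A))

# Every eigenvalue lies in the union of the discs, so the largest
# Laplacian eigenvalue is at most max(center + radius).
upper_bound = max(center + radius for center, radius in discs)
```

For a Laplacian, each disc is centered at a node's degree with a radius equal to that degree, which gives the familiar bound of twice the maximum degree on the largest eigenvalue.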
Leader-following consensus of fractional-order multi-agent systems is investigated. The agents are modeled as discrete-time fractional-order integrators or fractional-order double integrators. Moreover, the interaction between the agents is described by an undirected communication graph with a fixed topology. It is shown that the leader-following consensus problem for the considered agents can be converted to the asymptotic stability analysis of a discrete-time fractional-order system. Based on this idea, sufficient conditions for reaching leader-following consensus are derived in terms of the controller parameters. This leads to an admissible region in the controller parameter space. Numerical simulations are provided to show the performance of the proposed leader-following consensus approach.
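One common way to define a discrete-time fractional-order difference is the Grünwald-Letnikov form; whether the paper uses exactly this definition is an assumption here. The sketch below (the function names `gl_coefficients` and `fractional_difference` are illustrative) computes the weights (-1)^k C(alpha, k) recursively and applies them to a sample sequence with unit step size.

```python
def gl_coefficients(alpha, count):
    """Weights c_k = (-1)**k * binom(alpha, k) for k = 0..count."""
    coeffs = [1.0]
    for k in range(1, count + 1):
        # Recursion: c_k = c_{k-1} * (1 - (alpha + 1) / k)
        coeffs.append(coeffs[-1] * (1.0 - (alpha + 1.0) / k))
    return coeffs


def fractional_difference(samples, alpha):
    """Grünwald-Letnikov difference of order alpha (unit step size),
    evaluated at the last sample and using the full history."""
    coeffs = gl_coefficients(alpha, len(samples) - 1)
    return sum(c * x for c, x in zip(coeffs, reversed(samples)))


# alpha = 1 recovers the ordinary backward difference x[k] - x[k-1].
history = [0.0, 1.0, 4.0, 9.0]
first_order = fractional_difference(history, 1.0)
half_order = fractional_difference(history, 0.5)
```

Unlike the integer-order case, a non-integer `alpha` gives nonzero weights on the whole history, which is the memory effect that makes stability analysis of such agents harder.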
Aiming at relative-position consensus with obstacle avoidance for fractional-order multi-agent systems, a novel distributed control algorithm is proposed in this paper. First, a synthetic error of each agent under the influence of obstacles is introduced. Consensus protocols are then designed from this error according to sliding mode theory for the order-increasing and order-decreasing cases, respectively. Next, a Lyapunov function is used to prove the stable convergence of the protocols. Finally, the simulation results show that the protocols not only prevent the agents from colliding with obstacles, but also enable the agents to quickly recover the expected formation and achieve consensus of the relative position.
This paper investigates adaptive containment control for a class of fractional-order multi-agent systems (FOMASs) with time-varying parameters and disturbances. By using the bounded estimation method, the difficulty generated by the time-varying parameters and disturbances is overcome. A command filter is introduced to solve the complexity problem inherent in adaptive backstepping control. Meanwhile, in order to eliminate the effect of filter errors, a novel distributed error compensating scheme is constructed, in which only the local information from the neighbor agents is utilized. Then, a distributed adaptive containment control scheme for FOMASs is developed based on backstepping to guarantee that the outputs of all the followers are steered to the convex hull spanned by the leaders. Based on the extension of Barbalat's lemma to fractional-order integrals, it is proven that the containment errors and the compensating signals converge asymptotically. Finally, three simulation examples are given to show the feasibility and effectiveness of the proposed control method.
Multi-agent reinforcement learning (MARL) has proven its effectiveness in cooperative multi-agent systems (MASs) but still faces issues with the curse of dimensionality and learning efficiency. The main difficulty is caused by the strong inter-agent coupling embedded in an MARL problem, which is yet to be fully exploited in existing algorithms. In this work, we identify a learning graph characterizing the dependence between individual rewards and individual policies. We then propose a graph-based reward aggregation (GRA) method, which utilizes the inherent coupling relationship among agents to eliminate redundant information. Specifically, GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function, so that each agent learns a cooperative policy that can be executed in a decentralized manner. In addition, we propose a variant of GRA, named GRA-decen, which achieves decentralized training and decentralized execution (DTDE) when each agent only has access to the information of a subset of agents during learning. We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with its convergence time upper bound derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10^(-6) within 1.8 s.
This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS) because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to all agents. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, is a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), where the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can match the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then, sufficient conditions are derived to guarantee the positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
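For context, the TD3 family on which MATD3 builds forms its bootstrap target from the smaller of two target-critic estimates. The sketch below shows only that standard clipped double-Q target; it does not include the paper's ALN, RO, or dynamic-temperature reward, and the function and argument names are hypothetical.

```python
def clipped_double_q_target(reward, discount, done, q1_next, q2_next):
    """TD3-style target: r + gamma * (1 - done) * min(Q1', Q2')."""
    # Taking the minimum of the twin estimates counters overestimation.
    bootstrap = 0.0 if done else min(q1_next, q2_next)
    return reward + discount * bootstrap


# Illustrative numbers only.
target = clipped_double_q_target(
    reward=1.0, discount=0.9, done=False, q1_next=2.0, q2_next=3.0
)
terminal_target = clipped_double_q_target(
    reward=1.0, discount=0.9, done=True, q1_next=2.0, q2_next=3.0
)
```

The minimum is only informative when the two critics disagree, which is presumably why the abstract emphasizes reducing the correlation between the twin estimates.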
To maximize the profits of power grid operators (GOs), load aggregators (LAs) and electricity customers (ECs), this paper proposes a hierarchical demand response (HDR) framework that considers competing interaction, based on the multi-agent deep deterministic policy gradient (MaDDPG). The ECs are divided into conventional ECs and electric vehicles (EVs), which are managed by an ECs agent (ECA) and an EV agent (EVA), respectively, to exploit the flexibility of the HDR framework. Thus, the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits. To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model, a dynamic interaction mechanism is adopted. Moreover, to tackle the HDR involving various entities, the MaDDPG develops multiple agents to simulate the dynamic competing interactions between the subjects as well as to solve the problem of continuous action control. Furthermore, MaDDPG adopts soft target updates and prioritized experience replay to ensure stable and effective training, and uses exploration noise to make the exploration strategy comprehensive. Simulation studies are conducted to verify the performance of the MaDDPG with the dynamic interaction mechanism in dealing with multi-layer multi-agent continuous action control, compared to the double deep Q network (DDQN), deep Q network (DQN) and dueling DQN. Additionally, comparisons of the proposed HDR with price-based DR (PBDR) and incentive-based DR (IBDR) are analyzed to investigate the flexibility of the HDR.
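The soft target update mentioned above is usually implemented as Polyak averaging of the target network toward the online network. The sketch below uses plain lists as stand-ins for network parameters; the function name `soft_update` and the value `tau=0.1` are assumptions for illustration, not values from the paper.

```python
def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_params, online_params)
    ]


# Hypothetical parameters: the target slowly tracks the online values.
target = [0.0, 0.0]
online = [1.0, 2.0]
for _ in range(3):
    target = soft_update(target, online, tau=0.1)
```

A small `tau` keeps the bootstrap targets nearly stationary between updates, which is the stabilizing effect the abstract refers to.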
To solve the problem of in-flight actuator faults and parameter uncertainties for multiple Unmanned Aerial Vehicles (UAVs), and to reduce the communication and computational resource consumption of multiple UAVs, a Fractional-Order (FO) sliding-mode Fault-Tolerant Cooperative Control (FTCC) strategy is proposed for multiple UAVs based on an Event-Triggered Communication Mechanism (ET-COM-M) and an Event-Triggered Control Mechanism (ET-CON-M). First, by considering the limited communication bandwidth of multiple UAVs in formation, an ET-COM-M is designed to significantly reduce the number of communications. Then, a distributed observer is constructed to estimate the reference signals for follower UAVs. Moreover, an adaptive strategy is incorporated into the Radial Basis Function Neural Network (RBFNN) to learn the lumped unknown terms for handling bias actuator faults and parameter uncertainties. In addition, the Nussbaum method is used to deal with loss-of-effectiveness faults. To further refine the control performance against faults, FO calculus is integrated into the sliding-mode control protocol with ET-CON-M. Finally, Zeno behavior is excluded by rigorous theoretical analysis, and Lyapunov stability is proved to show the effectiveness of the designed FTCC strategy. Simulation results show that the designed FTCC strategy with the Event-Triggered Mechanism (ETM) can guarantee the safety of multiple UAVs and simultaneously reduce the communication and control frequencies, making the developed control scheme applicable in engineering.
This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to estimate the unmeasurable system state variables. To address the complexity explosion problem associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, validating that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose an algorithm of Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC) for collaborative confrontations. This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
This paper introduces a novel fractional-order model based on the Caputo-Fabrizio (CF) derivative for analyzing computer virus propagation in networked environments. The model partitions the computer population into four compartments: susceptible, latently infected, breaking-out, and antivirus-capable systems. By employing the CF derivative, which uses a nonsingular exponential kernel, the framework effectively captures memory-dependent and nonlocal characteristics intrinsic to cyber systems, aspects inadequately represented by traditional integer-order models. Under Lipschitz continuity and boundedness assumptions, the existence and uniqueness of solutions are rigorously established via fixed-point theory. We develop a tailored two-step Adams-Bashforth numerical scheme for the CF framework and prove its second-order accuracy. Extensive numerical simulations across various fractional orders reveal that memory effects significantly influence virus transmission and control dynamics; smaller fractional orders produce more pronounced memory effects, delaying both infection spread and antivirus activation. Further theoretical analysis, including Hyers-Ulam stability and sensitivity assessments, reinforces the model's robustness and identifies key parameters governing virus dynamics. The study also extends the framework to incorporate stochastic effects through a stochastic CF formulation. These results underscore fractional-order modeling as a powerful analytical tool for developing robust and effective cybersecurity strategies.
Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose Multi Agent-Chain of Thought (CoT), a novel multi-agent chain-of-thought reasoning framework where specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p<0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
To address the finite-time tracking control problem for fractional-order nonlinear systems (FONSs) with actuator faults and external disturbances, a novel finite-time adaptive fuzzy fault-tolerant control strategy is presented in this paper by utilizing finite-time stability theory and a fractional-order dynamic surface control scheme combined with the backstepping method. A new lemma is developed for analyzing the finite-time stability of FONSs in terms of a fractional differential inequality, which modifies some existing results. Fuzzy logic systems are adopted to identify unknown nonlinear characteristics in FONSs. In order to compensate for the influence of unknown external disturbances and the estimation error of the fuzzy logic systems, an auxiliary function is employed to estimate the upper bound of the parameters online. Furthermore, a global coordinate transformation is first introduced to decouple the fractional-order dynamic system of a specific class of underactuated single-link flexible manipulator systems, thereby transforming it into a lower triangular system. Simulation analyses and experimental results verify the feasibility and effectiveness of the finite-time tracking control algorithm.
This paper presents Dual Adaptive Neural Topology (Dual ANT), a distributed dual-network meta-adaptive framework that enhances ant-colony-based multi-agent coordination with online introspection, adaptive parameter control, and privacy-preserving interactions. This approach improves standard Ant Colony Optimization (ACO) with two lightweight neural components: a forward network that estimates swarm efficiency in real time and an inverse network that converts these descriptors into parameter adaptations. To preserve the privacy of individual trajectories in shared pheromone maps, we introduce a locally differentially private pheromone update mechanism that adds calibrated noise to each agent's pheromone deposit while preserving the efficacy of the global pheromone signal. The resulting system enables agents to dynamically and autonomously adapt their coordination strategies under challenging and dynamic conditions, including varying obstacle layouts, uncertain target locations, and time-varying disturbances. Extensive simulations of large grid-based search tasks demonstrated that Dual ANT achieved faster convergence, higher robustness, and improved scalability compared to advanced baselines such as Multi-Strategy ACO and Hierarchical ACO. The meta-adaptive feedback loop compensates for the performance degradation caused by privacy noise and prevents premature stagnation by triggering Levy flight exploration only when necessary.
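To make the idea of a locally differentially private pheromone deposit concrete, the sketch below perturbs each agent's deposit with Laplace noise before it reaches the shared map. The abstract only says that calibrated noise is added, so the Laplace mechanism, the clipping bound `max_deposit`, the privacy budget `epsilon`, and all function names here are assumptions rather than the paper's mechanism.

```python
import random


def private_deposit(deposit, max_deposit, epsilon, rng):
    """Perturb one agent's pheromone deposit with Laplace noise.

    The deposit is clipped to [0, max_deposit], so its sensitivity is
    max_deposit and the Laplace scale is max_deposit / epsilon.
    """
    clipped = min(max(deposit, 0.0), max_deposit)
    scale = max_deposit / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


def update_cell(pheromone, noisy_deposits, evaporation):
    """ACO-style cell update: evaporate, add deposits, floor at zero."""
    value = (1.0 - evaporation) * pheromone + sum(noisy_deposits)
    return max(value, 0.0)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy = [private_deposit(d, max_deposit=1.0, epsilon=2.0, rng=rng)
         for d in (0.2, 0.8, 1.5)]
cell = update_cell(pheromone=1.0, noisy_deposits=noisy, evaporation=0.1)
```

Because the noise is zero-mean, summing many agents' deposits tends to average it out, which is one plausible reading of how the global pheromone signal stays useful; flooring the cell at zero is post-processing and does not weaken the privacy guarantee.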
Funding (leader-following tracking of fractional-order multi-agent systems): Supported by the National Natural Science Foundation of China (61303211) and the Zhejiang Provincial Natural Science Foundation of China (LY17F030003, LY15F030009).
Funding (fractional-order multi-agent systems with current and delay states): Project supported by the National Natural Science Foundation of China (Nos. 11471230 and 11671282).
Funding (relative-position consensus with obstacle avoidance): Supported in part by the Natural Science Foundation of Shaanxi Province (2024JC-YBMS-451, 2024JC-YBQN-0398).
Funding (adaptive containment control of FOMASs): National Key R&D Program of China (2018YFA0702200), National Natural Science Foundation of China (61627809, 62173080), and Liaoning Revitalization Talents Program (XLYC1801005).
Funding (graph-based reward aggregation for MARL): Supported in part by the National Natural Science Foundation of China (grants 62203073 and 62573068) and the Natural Science Foundation of Chongqing, China (grant CSTB2022NSCQMSX0577).
Funding: Supported by the National Natural Science Foundation of China under Grants 61962023, 61562029 and 62466019.
Abstract: This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with its convergence time upper bound derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10^(-6) within 1.8 s.
Abstract: This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Funding: Supported by the National Natural Science Foundation of China Youth Fund (No. 62101579).
Abstract: Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS) because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to all agents. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, is a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), where the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can match the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
Abstract: In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Funding: Supported by the National Natural Science Foundation of China (62463007, 62463005), the Natural Science Foundation of Hainan Province (625RC710, 625MS047), the System Control and Information Processing Education Ministry Key Laboratory Open Funding, China (Scip20240119), and the Science Research Funding of Hainan University, China (KYQD(ZR)22180, KYQD(ZR)23180).
Abstract: This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then sufficient conditions are derived to guarantee the positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
Abstract: With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Funding: Supported by the National Natural Science Foundation of China (No. 52477097), the Guangdong Basic and Applied Basic Research Foundation (2023A1515240014), and the State Key Laboratory of Advanced Electromagnetic Technology (Grant No. AET 2024KF005).
Abstract: To maximize the profits of power grid operators (GOs), load aggregators (LAs) and electricity customers (ECs), this paper proposes a hierarchical demand response (HDR) framework that considers competing interaction, based on the multi-agent deep deterministic policy gradient (MaDDPG). The ECs are divided into conventional ECs and electric vehicles (EVs), which are managed by an ECs agent (ECA) and an EV agent (EVA) to exploit the flexibility of the HDR framework. Thus, the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits. To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model, a dynamic interaction mechanism is adopted. Moreover, to tackle the HDR involving various entities, the MaDDPG develops multiple agents to simulate the dynamic competing interactions between the subjects as well as to solve the problem of continuous action control. Furthermore, MaDDPG adopts soft target updates and prioritized experience replay to ensure stable and effective training, and makes the exploration strategy comprehensive by using exploration noise. Simulation studies are conducted to verify the performance of the MaDDPG with the dynamic interaction mechanism in dealing with multilayer multi-agent continuous action control, compared to the double deep Q network (DDQN), deep Q network (DQN) and dueling DQN. Additionally, the proposed HDR is compared with price-based DR (PBDR) and incentive-based DR (IBDR) to investigate the flexibility of the HDR.
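The soft target update mentioned in the abstract above can be sketched as follows. This is only an illustration of the generic Polyak-averaging rule used in DDPG-style methods, not the authors' implementation; the function name `soft_update`, the dictionary-of-floats parameter layout, and the value of `tau` are assumptions made for the example.

```python
def soft_update(target, online, tau=0.01):
    """Blend online parameters into target parameters in place.

    Generic rule: target <- tau * online + (1 - tau) * target.
    `target` and `online` are hypothetical dicts mapping parameter
    names to floats; a real network would hold tensors instead.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    if target.keys() != online.keys():
        raise ValueError("parameter names must match")
    for name, value in online.items():
        target[name] = tau * value + (1.0 - tau) * target[name]
    return target


# Usage: the target parameters drift slowly toward the online ones.
target_params = {"w": 0.0, "b": 1.0}
online_params = {"w": 1.0, "b": 0.0}
for _ in range(3):
    soft_update(target_params, online_params, tau=0.5)
print(target_params)
```

A small `tau` makes the target network change slowly, which is the usual reason this rule is credited with stabilizing training.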
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62373188, 62003162), the Natural Science Foundation of Jiangsu Province of China (Nos. BK20240182, BK20222012), the Industry-University Research Innovation Foundation for the Chinese Ministry of Education (No. 2021ZYA02005), the Aeronautical Science Foundation of China (Nos. 20220007052003, 20200007018001), and the Fundamental Research Funds for the Central Universities, China (Nos. NE2024004, NI2024001).
Abstract: To solve the problem of in-flight actuator faults and parameter uncertainties for multiple Unmanned Aerial Vehicles (UAVs), and to reduce their communication and computational resource consumption, a Fractional-Order (FO) sliding-mode Fault-Tolerant Cooperative Control (FTCC) strategy is proposed for multiple UAVs based on an Event-Triggered Communication Mechanism (ET-COM-M) and an Event-Triggered Control Mechanism (ET-CON-M). First, by considering the limited communication bandwidth of multiple UAVs in formation, an ET-COM-M is designed to significantly reduce the number of communications. Then, a distributed observer is constructed to estimate the reference signals for follower UAVs. Moreover, the adaptive strategy is incorporated into the Radial Basis Function Neural Network (RBFNN) to learn the lumped unknown terms for handling bias actuator faults and parameter uncertainties. Besides, the Nussbaum method is used to deal with the loss-of-effectiveness faults. To further refine the control performance against faults, FO calculus is integrated into the sliding-mode control protocol with ET-CON-M. Finally, Zeno behavior is excluded by rigorous theoretical analysis, and Lyapunov stability is proved to show the effectiveness of the designed FTCC strategy. Simulation results show that the designed FTCC strategy with the Event-Triggered Mechanism (ETM) can guarantee the safety of multiple UAVs and simultaneously reduce the communication and control frequencies, making the developed control scheme applicable in engineering.
Funding: Supported in part by the Beijing Natural Science Foundation under Grant 4252050, in part by the National Science Fund for Distinguished Young Scholars under Grant 62425304, and in part by the Basic Science Center Programs of NSFC under Grant 62088101.
Abstract: This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to approximate the unmeasurable system state variables. To address the complexity explosion problem associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, validating that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 72371052 and 71871042).
Abstract: Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose an algorithm of Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC) for collaborative confrontations. This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2601).
Abstract: This paper introduces a novel fractional-order model based on the Caputo-Fabrizio (CF) derivative for analyzing computer virus propagation in networked environments. The model partitions the computer population into four compartments: susceptible, latently infected, breaking-out, and antivirus-capable systems. By employing the CF derivative, which uses a nonsingular exponential kernel, the framework effectively captures memory-dependent and nonlocal characteristics intrinsic to cyber systems, aspects inadequately represented by traditional integer-order models. Under Lipschitz continuity and boundedness assumptions, the existence and uniqueness of solutions are rigorously established via fixed-point theory. We develop a tailored two-step Adams-Bashforth numerical scheme for the CF framework and prove its second-order accuracy. Extensive numerical simulations across various fractional orders reveal that memory effects significantly influence virus transmission and control dynamics; smaller fractional orders produce more pronounced memory effects, delaying both infection spread and antivirus activation. Further theoretical analysis, including Hyers-Ulam stability and sensitivity assessments, reinforces the model's robustness and identifies key parameters governing virus dynamics. The study also extends the framework to incorporate stochastic effects through a stochastic CF formulation. These results underscore fractional-order modeling as a powerful analytical tool for developing robust and effective cybersecurity strategies.
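The abstract above does not state its numerical scheme explicitly, so the following is only a rough sketch of a commonly cited two-step Adams-Bashforth update for a scalar Caputo-Fabrizio problem, and it may differ from the tailored scheme the authors develop. The normalization `m_alpha = 1`, the start-up step, the function name `cf_adams_bashforth2`, and the scalar test equation are all assumptions made for illustration.

```python
import math


def cf_adams_bashforth2(f, y0, alpha, h, steps, m_alpha=1.0):
    """Approximate y(steps * h) for a Caputo-Fabrizio problem D^alpha y = f(t, y).

    Update used (a commonly cited form, assumed here):
      y[n+1] = y[n] + (a + 1.5*b) * f[n] - (a + 0.5*b) * f[n-1]
    with a = (1 - alpha) / M(alpha) and b = alpha * h / M(alpha).
    For alpha = 1 and M = 1 this reduces to the classical two-step
    Adams-Bashforth method.
    """
    a = (1.0 - alpha) / m_alpha
    b = alpha * h / m_alpha
    t_prev, y_prev = 0.0, y0
    f_prev = f(t_prev, y_prev)
    # Start-up step (an illustrative choice): one explicit step that
    # coincides with forward Euler when alpha = 1.
    t_curr, y_curr = h, y_prev + b * f_prev
    for _ in range(steps - 1):
        f_curr = f(t_curr, y_curr)
        y_next = y_curr + (a + 1.5 * b) * f_curr - (a + 0.5 * b) * f_prev
        f_prev, y_curr = f_curr, y_next
        t_curr += h
    return y_curr


# Usage: with alpha = 1 the result should track the classical solution
# of y' = -y, y(0) = 1, which is exp(-t).
approx = cf_adams_bashforth2(lambda t, y: -y, y0=1.0, alpha=1.0, h=0.001, steps=1000)
print(approx, math.exp(-1.0))
```

Checking the integer-order limit is a cheap sanity test for any fractional scheme, since the classical method is well understood there.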
Abstract: Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose Multi Agent-Chain of Thought (CoT), a novel multi-agent chain-of-thought reasoning framework where specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p<0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
Funding: Supported by the National Natural Science Foundation of China (62403340, 62303339), the Sichuan Science and Technology Program (2026NSFSC1518), the China Postdoctoral Science Foundation (CPSF) (2025T180940, 2024M762208), the Postdoctoral Fellowship Program of CPSF (GZC20231783), and the Guangxi Key Laboratory of Brain-Inspired Computing and Intelligent Chips (BCIC-24-K2).
Abstract: To address the finite-time tracking control problem for fractional-order nonlinear systems (FONSs) with actuator faults and external disturbance, a novel finite-time adaptive fuzzy fault-tolerant control strategy is presented in this paper by utilizing the finite-time stability theory and a fractional-order dynamic surface control scheme combined with the backstepping method. A new lemma is developed for analyzing the finite-time stability of FONSs in terms of a fractional differential inequality, which modifies some existing results. Fuzzy logic systems are adopted to identify unknown nonlinear characteristics in FONSs. In order to compensate for the influence of the unknown external disturbance and the estimation error of the fuzzy logic systems, an auxiliary function is employed to estimate the upper bound of parameters online. Furthermore, a global coordinate transformation is introduced to decouple the fractional-order dynamic system of a specific class of underactuated single-link flexible manipulator systems, thereby transforming it into lower triangular systems. Simulation analyses and experimental results verify the feasibility and effectiveness of the finite-time tracking control algorithm.
Funding: Funded by the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, under project number NBU-FFR-2026-2441-02.
Abstract: This paper presents Dual Adaptive Neural Topology (Dual ANT), a distributed dual-network meta-adaptive framework that enhances ant-colony-based multi-agent coordination with online introspection, adaptive parameter control, and privacy-preserving interactions. This approach improves standard Ant Colony Optimization (ACO) with two lightweight neural components: a forward network that estimates swarm efficiency in real time and an inverse network that converts these descriptors into parameter adaptations. To preserve the privacy of individual trajectories in shared pheromone maps, we introduce a locally differentially private pheromone update mechanism that adds calibrated noise to each agent's pheromone deposit while preserving the efficacy of the global pheromone signal. The resulting system enables agents to dynamically and autonomously adapt their coordination strategies under challenging and dynamic conditions, including varying obstacle layouts, uncertain target locations, and time-varying disturbances. Extensive simulations of large grid-based search tasks demonstrated that Dual ANT achieved faster convergence, higher robustness, and improved scalability compared to advanced baselines such as Multi-Strategy ACO and Hierarchical ACO. The meta-adaptive feedback loop compensates for the performance degradation caused by privacy noise and prevents premature stagnation by triggering Levy flight exploration only when necessary.
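The abstract above says only that calibrated noise is added to each agent's pheromone deposit; it does not name the noise distribution. As a hedged illustration, the sketch below uses the standard Laplace mechanism for local differential privacy, which is an assumption and may not be the mechanism used in Dual ANT. The function name `private_deposit`, the clipping bound `max_deposit`, and the privacy budget `epsilon` are hypothetical.

```python
import random


def private_deposit(deposit, max_deposit, epsilon, rng=random):
    """Return a noisy pheromone deposit using the Laplace mechanism.

    The deposit is clipped to [0, max_deposit], so any two possible
    inputs differ by at most max_deposit; Laplace noise with scale
    max_deposit / epsilon then gives epsilon-local differential privacy.
    """
    if epsilon <= 0 or max_deposit <= 0:
        raise ValueError("epsilon and max_deposit must be positive")
    clipped = min(max(deposit, 0.0), max_deposit)
    scale = max_deposit / epsilon
    # The difference of two i.i.d. exponential draws is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


# Usage: individual reports are noisy, but their average stays close to
# the true deposit, so an aggregated pheromone signal remains usable.
rng = random.Random(0)
reports = [private_deposit(0.6, max_deposit=1.0, epsilon=2.0, rng=rng) for _ in range(20000)]
print(sum(reports) / len(reports))
```

Because the noise has zero mean, summing many agents' reports on the same cell averages it out, which is one way a shared map can stay informative under per-agent noise.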