Aiming at the problem on cooperative air-defense of surface warship formation, this paper maps the cooperative airdefense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune s...Aiming at the problem on cooperative air-defense of surface warship formation, this paper maps the cooperative airdefense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS) according to the similarity of the defense mechanism and characteristics between the CASoSSWF and the BIS, and then designs the models of components and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent and a communicating agent by making use of the theories and methods of the artificial immune system, the multi-agent system (MAS), the vaccine and the danger theory (DT). Moreover a new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is advanced. The immune response and immune mechanism of the CASoSSWF are analyzed. The model has a capability of memory, evolution, commendable dynamic environment adaptability and self-learning, and embodies adequately the cooperative air-defense mechanism for the CASoSSWF. Therefore it shows a novel idea for the CASoSSWF which can provide conception models for a surface warship formation operation simulation system.展开更多
In this paper,the multi-agent model about shop logistics is set up.This model has 8 agents:raw materials stock agent,process agent,testing agent,transition agent,production information agent,scheduling agent,process a...In this paper,the multi-agent model about shop logistics is set up.This model has 8 agents:raw materials stock agent,process agent,testing agent,transition agent,production information agent,scheduling agent,process agent and stock agent.The scheduling agent has three subagents:manager agent(MA),resource agent(RA)and part agent(PA).MA,PA and RA are communicating equally that guarantees agility of the whole MAS system.The part tasks pass between MA,RA and PA as an integer,which can guarantee the consistency of the data.We use a detailed example about shop logistics scheduling in a semiconductor company to explain the principle.In this example,we use two scheduling strategies:FCFS and SPT.The result data indicates that the average flow time and lingering ratio are changed using different strategy.It is proves that the multi-agent scheduling is useful.展开更多
In this study, three-phase satellite images were used to define rules for the allocation of time and space in construction land resources based on a complex adaptive system and game theory. The decision behavior and r...In this study, three-phase satellite images were used to define rules for the allocation of time and space in construction land resources based on a complex adaptive system and game theory. The decision behavior and rules of government agent, enterprise agent and resident agent in construction land growth were explored. A distinctive and dynamic simulation model of construction land growth was built, which integrated multi-agent, GIS technology and RS data and described the interaction among influencing agents, Taking Fuyang City in the Changjiang River Delta as an example, an assessment process for the remote sensing data in construction land and scenario planning was constructed. Repast and ArcGIS were used as simulation platforms. A simulation of the spatial pattern in land-use planning and the setting of scenario planning were conducted by using the incomplete active game, which was based on different natural, social and economic levels. Through this model, a simulation of urban planning space and decision-making for Fuyang City was created. Relevant non-structured problems arising from urban planning management could be identified, and the process and logic of urban planning spatial decision-making could thus be improved. Cell-by-cell comparison showed that the simulation accuracy was over 72%. This model has great potential for use by government and town planners in decision support and technique support in the policy-making process.展开更多
Because of the anonymity and openness of online transactions and the richness of network resources, the problems of the credibility of the online trading and the exact selection of network resources have become acute....Because of the anonymity and openness of online transactions and the richness of network resources, the problems of the credibility of the online trading and the exact selection of network resources have become acute. For this reason, a reputation-based multi-agent model for network resource selection (RMNRS) is presented. The model divides the network into numbers of trust domains. Each domain has one domain-agent and several entity-agents. The model prevents the inconsistency of information that is maintained by differ-ent agents through the periodically communication between the agents. The model enables the consumers to receive responses from agents significantly quicker than that of traditional models, because the global reputation values of service providers and consumers are evaluated and updated dynamically after each transaction. And the model allocates two global reputation values to each entity and takes the recognition value that how much the service provider knows the service into account. In order to make users choose the best matching services and give users with trusted services, the model also takes the similarity between services into account and uses the similarity degree to amend the integration reputation value with harmonic-mean. Finally, the effectiveness and feasibility of this model is illustrated by the experiment.展开更多
Because of the anonymity and openness of E-commerce, the on-line transaction and the selection of network resources meet new challenges. For this reason, a trust domain-based multi-agent model for network resource sel...Because of the anonymity and openness of E-commerce, the on-line transaction and the selection of network resources meet new challenges. For this reason, a trust domain-based multi-agent model for network resource selection is presented. The model divides the network into numbers of trust domains and prevents the inconsistency of information maintained by different agents through the periodical communication among the agents. The model enables consumers to receive the response from the agents much quicker because the trust values of participators are evalUated and updated dynamically and timely after the completion of each transaction. In order to make users choose the best matching services and give users with trusted services, the model takes into account the similarity between services and the service providers' recognition to the services. Finally, the model illustrates the effectiveness and feasibility according to the experiment.展开更多
Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities.Current approaches struggle with crossmodal ...Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities.Current approaches struggle with crossmodal alignment,temporal consistency,and robust handling of noisy or incomplete inputs across multiple modalities.We propose Multi Agent-Chain of Thought(CoT),a novel multi-agent chain-of-thought reasoning framework where specialized agents for text,vision,and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms.Our architecture incorporates self-reflection modules,conflict resolution protocols,and dynamic rationale alignment to enhance consistency,factual accuracy,and user engagement.The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity.Comprehensive evaluations on Situated Interactive Multi-Modal Conversations(SIMMC)2.0,VisDial v1.0,and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy(p<0.01),chain-of-thought interpretability,and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.展开更多
Multi-Agent Systems(MAS),which consist of multiple interacting agents,are crucial in Cyber-Physical Systems(CPS),because they improve system adaptability,efficiency,and robustness through parallel processing and colla...Multi-Agent Systems(MAS),which consist of multiple interacting agents,are crucial in Cyber-Physical Systems(CPS),because they improve system adaptability,efficiency,and robustness through parallel processing and collaboration.However,most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems where data are distributed stored and inaccessible to all agents.Meta-GMVAE,based on Variational Autoencoder(VAE)and set-level variational inference,represents a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks,increasing adaptability and reducing sample requirements.Inspired by these advancements,we propose a novel Distributed Unsupervised Meta-Learning(DUML)framework based on Meta-GMVAE and a fusion strategy.Furthermore,we present a DUML algorithm based on Gaussian Mixture Model(DUMLGMM),where the parameters of the Gaussian-mixture are solved by an Expectation-Maximization algorithm.Simulations on Omniglot and Mini Image Net datasets show that DUMLGMM can achieve the performance of the corresponding centralized algorithm and outperform non-cooperative algorithm.展开更多
Multi-agent reinforcement learning(MARL)has proven its effectiveness in cooperative multi-agent systems(MASs)but still faces issues on the curse of dimensionality and learning efficiency.The main difficulty is caused ...Multi-agent reinforcement learning(MARL)has proven its effectiveness in cooperative multi-agent systems(MASs)but still faces issues on the curse of dimensionality and learning efficiency.The main difficulty is caused by the strong inter-agent coupling nature embedded in an MARL problem,which is yet to be fully exploited in existing algorithms.In this work,we recognize a learning graph characterizing the dependence between individual rewards and individual policies.Then we propose a graph-based reward aggregation(GRA)method,which utilizes the inherent coupling relationship among agents to eliminate redundant information.Specifically,GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function,making each agent learn a decentralized executable cooperation policy.In addition,we propose a variant of GRA,named GRA-decen,which achieves decentralized training and decentralized execution(DTDE)when each agent only has access to information of partial agents in the learning process.We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.展开更多
This paper presents an adaptive multi-agent coordination(AMAC)strategy suitable for complex scenarios,which only requires information exchange between neighbouring robots.Unlike traditional multi-agent coordination me...This paper presents an adaptive multi-agent coordination(AMAC)strategy suitable for complex scenarios,which only requires information exchange between neighbouring robots.Unlike traditional multi-agent coordination methods that are solved by neural dynamics,the proposed strategy displays greater flexibility,adaptability and scalability.Furthermore,the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation.By introducing a dynamic error function,a fixed-time convergent zeroing neural network(FTCZNN)model is designed for the online solution of the AMAC strategy,with its convergence time upper bound derived theoretically.Finally,the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments.Numerical results indicate that this method can reduce the formation error to the order of 10^(-6)within 1.8 s.展开更多
This paper addresses the synchronization of follower agents’state vectors with that of a leader in high-order nonlinear multi-agent systems.The proposed low-complexity control scheme employs high-gain observers to es...This paper addresses the synchronization of follower agents’state vectors with that of a leader in high-order nonlinear multi-agent systems.The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors,enabling the controller to rely solely on relative output measurements.This approach significantly reduces the dependence on full-state information,which is often infeasible or costly in practical engineering applications.An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization.Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.展开更多
This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems.First,a state-feedback controller with dynamic compensation is introduced to achieve positive consen...This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems.First,a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching.Then sufficient conditions are derived to guarantee the positive consensus.The gain matrices of the control protocol are described using a matrix decomposition approach and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions.Finally,two numerical examples are provided to illustrate the results obtained.展开更多
In recent years,researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions;yet this paradigm provides no capacity for inter-agent interaction.Mu...In recent years,researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions;yet this paradigm provides no capacity for inter-agent interaction.Multi-agent reinforcement learning(MARL)overcomes this limitation by allowing several agents to learn simultaneously within a shared environment,each choosing actions that maximize its own or the group's rewards.By explicitly modeling and exploiting agent-to-agent dynamics,MARL can align those interactions with pedagogical goals such as peer tutoring,collaborative problem-solving,or gamified competition,thus opening richer avenues for adaptive and socially informed learning experiences.This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance,engagement,equity,and reducing teacher workload compared to single agent or traditional approaches.It explores the educational domains and pedagogical problems addressed by MARL,identifies the algorithmic families used,and analyzes their influence on learning.The review also assesses experimental settings and evaluation metrics to determine ecological validity,and outlines current challenges and future research directions in applying MARL to education.展开更多
With the advent of sixth-generation mobile communications(6G),space-air-ground integrated networks have become mainstream.This paper focuses on collaborative scheduling for mobile edge computing(MEC)under a three-tier...With the advent of sixth-generation mobile communications(6G),space-air-ground integrated networks have become mainstream.This paper focuses on collaborative scheduling for mobile edge computing(MEC)under a three-tier heterogeneous architecture composed of mobile devices,unmanned aerial vehicles(UAVs),and macro base stations(BSs).This scenario typically faces fast channel fading,dynamic computational loads,and energy constraints,whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings.To address this issue,we formulate a multi-agent Markov decision process(MDP)for an air-ground-fused MEC system,unify link selection,bandwidth/power allocation,and task offloading into a continuous action space and propose a joint scheduling strategy that is based on an improved MATD3 algorithm.The improvements include Alternating Layer Normalization(ALN)in the actor to suppress gradient variance,Residual Orthogonalization(RO)in the critic to reduce the correlation between the twin Q-value estimates,and a dynamic-temperature reward to enable adaptive trade-offs during training.On a multi-user,dual-link simulation platform,we conduct ablation and baseline comparisons.The results reveal that the proposed method has better convergence and stability.Compared with MADDPG,TD3,and DSAC,our algorithm achieves more robust performance across key metrics.展开更多
To maximize the profits of power grid operators(GOs),load aggregators(LAs)and electricity customers(ECs),this paper proposes a hierarchical demand response(HDR)framework that considers competing interaction based on m...To maximize the profits of power grid operators(GOs),load aggregators(LAs)and electricity customers(ECs),this paper proposes a hierarchical demand response(HDR)framework that considers competing interaction based on multiagent deep deterministic policy gradient(MaDDPG).The ECs are divided into conventional ECs and the electric vehicles(EVs)which are managed by ECs agent(ECA)and EV agent(EVA)to exploit the flexibility of the HDR framework.Thus,the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits.To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model,a dynamic interaction mechanism is adopted.Moreover,to tackle the HDR involving various entities,the MaDDPG develops multiple agents to simulation the dynamic competing interactions between each subject as well as solve the problem of continuous action control.Furthermore,MaDDPG adopts soft target update and priority experience replay method to ensure stable and effective training,and makes the exploration strategy comprehensive by using exploration noise.Simulation studies are conducted to verify the performance of the MaDDPG with dynamic interaction mechanism in dealing with multilayer multi-agent continuous action control,compared to the double deep Q network(DDQN),deep Q network(DQN)and dueling DQN.Additionally,comparisons among the proposed HDR with the price based DR(PBDR)and incentive based DR(IBDR)are analyzed to investigate the flexibility of the HDR.展开更多
This paper investigates the consensus tracking control problem for high order nonlinear multi-agent systems subject to non-affine faults,partial measurable states,uncertain control coefficients,and unknown external di...This paper investigates the consensus tracking control problem for high order nonlinear multi-agent systems subject to non-affine faults,partial measurable states,uncertain control coefficients,and unknown external disturbances.Under the directed topology conditions,an observer-based finite-time control strategy based on adaptive backstepping and is proposed,in which a neural network-based state observer is employed to approximate the unmeasurable system state variables.To address the complexity explosion problem associated with the backstepping method,a finite-time command filter is incorporated,with error compensation signals designed to mitigate the filter-induced errors.Additionally,the Butterworth low-pass filter is introduced to avoid the algebraic ring problem in the design of the controller.The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion,validating that all closed-loop signals of the system remain bounded within a finite time.Finally,the effectiveness of the proposed control strategy is verified through a simulation example.展开更多
Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge.Multi-Agent Reinforcement Learning(MARL)provides an effective framework for tackling sequential decision-making p...Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge.Multi-Agent Reinforcement Learning(MARL)provides an effective framework for tackling sequential decision-making problems,significantly enhancing swarm intelligence in maneuvering.However,applying MARL to unmanned swarms presents two primary challenges.First,defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries.Second,current algorithms aim to maximize global or individual rewards,making them sensitive to fluctuations in enemy strategies and environmental changes,especially when rewards are sparse.To tackle these issues,we propose an algorithm of MultiAgent Reinforcement Learning with Layered Autonomy and Collaboration(MARL-LAC)for collaborative confrontations.This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients.Furthermore,MARL-LAC employs layered autonomy and collaboration to address multi-objective problems,specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents.Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents,outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems.The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.展开更多
In many regions both urban expansion and rural development take place simultaneously, and for the purpose of understanding the dynamic process of land use/cover change (LUCC) in such large areas, this study develops...In many regions both urban expansion and rural development take place simultaneously, and for the purpose of understanding the dynamic process of land use/cover change (LUCC) in such large areas, this study develops a multi-agent based land use model. Taking the Poyang Lake area of China as a typical case, this study applies the mechanism of diffusion-limited aggregation to simulate the behavior of urban agents, while rural land use is illustrated with a bottom-up based model consisting of agent and environment layers. In the agent layer, each household agent makes its own decisions on land use, and at each time interval a government agent takes control of land use by implementing policies. According to incomes and the rate of migrant workers, household agents are divided into six categories, among which different decision rules are followed. For complex LUCC in the Poyang Lake area of China from 1985 to 2005, the artificial society model developed in this study yields results highly consistent with observations. Importantly, it is shown that governmental policies can impose significant effects on the decisions of individual household agents on land use and the multi-agent-based land use model developed in this study provides a robust means for assessing the effectiveness of governmental policies.展开更多
This paper presents Dual Adaptive Neural Topology(Dual ANT),a distributed dual-network metaadaptive framework that enhances ant-colony-based multi-agent coordination with online introspection,adaptive parameter contro...This paper presents Dual Adaptive Neural Topology(Dual ANT),a distributed dual-network metaadaptive framework that enhances ant-colony-based multi-agent coordination with online introspection,adaptive parameter control,and privacy-preserving interactions.This approach improves standard Ant Colony Optimization(ACO)with two lightweight neural components:a forward network that estimates swarm efficiency in real time and an inverse network that converts these descriptors into parameter adaptations.To preserve the privacy of individual trajectories in shared pheromone maps,we introduce a locally differentially private pheromone update mechanism that adds calibrated noise to each agent’s pheromone deposit while preserving the efficacy of the global pheromone signal.The resulting systemenables agents to dynamically and autonomously adapt their coordination strategies under challenging and dynamic conditions,including varying obstacle layouts,uncertain target locations,and time-varying disturbances.Extensive simulations of large grid-based search tasks demonstrated that Dual ANT achieved faster convergence,higher robustness,and improved scalability compared to advanced baselines such asMulti-StrategyACO and Hierarchical ACO.The meta-adaptive feedback loop compensates for the performance degradation caused by privacy noise and prevents premature stagnation by triggering Levy flight exploration only when necessary.展开更多
The wireless cloud robotic system(WCRS),which fully integrates sensing,communication,computing,and control capabilities as an intelligent agent,is a promising way to achieve intelligent manufacturing due to easy deplo...The wireless cloud robotic system(WCRS),which fully integrates sensing,communication,computing,and control capabilities as an intelligent agent,is a promising way to achieve intelligent manufacturing due to easy deployment and flexible expansion.However,the high-precision control of WCRS requires deterministic wireless communication,which is always challenging in the complex and dynamic radio space.This paper employs the reconfigurable intelligent surface(RIS)to establish a novel RIS-assisted WCRS architecture,where the radio channel is controlled to achieve ultra-reliable,low-delay,and low-jitter communication for high-precision closed-loop motion control.However,control and communication are strongly coupled and should be co-optimized.Fully considering the constraints of control input threshold,control delay deadline,beam phase,antenna power,and information distortion,we establish a stability maximization problem to jointly optimize control input compensation,RIS phase shift,and beamforming.Herein,a new jitter-oriented system stability objective with respect to control error and communication jitter is defined and the closed-form expression of control delay deadline is derived based on the Jensen Inequality and Lyapunov-Krasovskii functional.Due to the time-varying and partial observability of the channel and robot states,we model the problem as a partially observable Markov decision process(POMDP).To solve this complex problem,we propose a multi-agent transfer reinforcement learning algorithm named LSTM-PPO-MATRL,where the LSTM-enhanced proximal policy optimization(PPO)is designed to approximate an optimal solution and the option-guided policy transfer learning is proposed to facilitate the learning process.By centralized training and decentralized execution,LSTM-PPO-MATRL is validated by extensive experiments on MuJoCo tasks for both low-mobility and high-mobility robotic control scenarios.The results demonstrate that LSTM-PPO-MATRL not only realizes high learning efficiency,but also supports low-delay,low-jitter communication for low error control,where 71.9%control accuracy improvement and 68.7%delay jitter reduction are achieved compared to the PPO-MADRL baseline.展开更多
With the increasing role of social media in information dissemination,effectively simulating and analyzing public event dynamics has become a key research focus.We present an interactive visual analysis system for sim...With the increasing role of social media in information dissemination,effectively simulating and analyzing public event dynamics has become a key research focus.We present an interactive visual analysis system for simulating social media events using multi-agent models powered by large language models(LLMs).By modeling agents with diverse characteristics,the system explores how agents perceive information,adjust their emotions and stances,provide feedback,and influence the trajectory of events.The system integrates real-time interactive simulation with multi-perspective visualization,enabling users to investigate event trajectories and key influencing factors under varied configurations.Theoretical work standardizes agent attributes and interaction mechanisms,supporting realistic simulation of social media behaviors.Evaluation through indicators and case studies demonstrates the system’s effectiveness and adaptability,offering a novel tool for public event analysis across open social platforms.展开更多
文摘Aiming at the problem on cooperative air-defense of surface warship formation, this paper maps the cooperative airdefense system of systems (SoS) for surface warship formation (CASoSSWF) to the biological immune system (BIS) according to the similarity of the defense mechanism and characteristics between the CASoSSWF and the BIS, and then designs the models of components and the architecture for a monitoring agent, a regulating agent, a killer agent, a pre-warning agent and a communicating agent by making use of the theories and methods of the artificial immune system, the multi-agent system (MAS), the vaccine and the danger theory (DT). Moreover a new immune multi-agent model using vaccine based on DT (IMMUVBDT) for the cooperative air-defense SoS is advanced. The immune response and immune mechanism of the CASoSSWF are analyzed. The model has a capability of memory, evolution, commendable dynamic environment adaptability and self-learning, and embodies adequately the cooperative air-defense mechanism for the CASoSSWF. Therefore it shows a novel idea for the CASoSSWF which can provide conception models for a surface warship formation operation simulation system.
基金Supported by the Zhejiang Province Science Foundation of China(M703022)
文摘In this paper,the multi-agent model about shop logistics is set up.This model has 8 agents:raw materials stock agent,process agent,testing agent,transition agent,production information agent,scheduling agent,process agent and stock agent.The scheduling agent has three subagents:manager agent(MA),resource agent(RA)and part agent(PA).MA,PA and RA are communicating equally that guarantees agility of the whole MAS system.The part tasks pass between MA,RA and PA as an integer,which can guarantee the consistency of the data.We use a detailed example about shop logistics scheduling in a semiconductor company to explain the principle.In this example,we use two scheduling strategies:FCFS and SPT.The result data indicates that the average flow time and lingering ratio are changed using different strategy.It is proves that the multi-agent scheduling is useful.
基金Under the auspices of National Science and Technology Support Program of China(No.2012BAH29B04-00)
文摘In this study, three-phase satellite images were used to define rules for the allocation of time and space in construction land resources based on a complex adaptive system and game theory. The decision behavior and rules of government agent, enterprise agent and resident agent in construction land growth were explored. A distinctive and dynamic simulation model of construction land growth was built, which integrated multi-agent, GIS technology and RS data and described the interaction among influencing agents, Taking Fuyang City in the Changjiang River Delta as an example, an assessment process for the remote sensing data in construction land and scenario planning was constructed. Repast and ArcGIS were used as simulation platforms. A simulation of the spatial pattern in land-use planning and the setting of scenario planning were conducted by using the incomplete active game, which was based on different natural, social and economic levels. Through this model, a simulation of urban planning space and decision-making for Fuyang City was created. Relevant non-structured problems arising from urban planning management could be identified, and the process and logic of urban planning spatial decision-making could thus be improved. Cell-by-cell comparison showed that the simulation accuracy was over 72%. This model has great potential for use by government and town planners in decision support and technique support in the policy-making process.
文摘Because of the anonymity and openness of online transactions and the richness of network resources, the problems of the credibility of the online trading and the exact selection of network resources have become acute. For this reason, a reputation-based multi-agent model for network resource selection (RMNRS) is presented. The model divides the network into numbers of trust domains. Each domain has one domain-agent and several entity-agents. The model prevents the inconsistency of information that is maintained by differ-ent agents through the periodically communication between the agents. The model enables the consumers to receive responses from agents significantly quicker than that of traditional models, because the global reputation values of service providers and consumers are evaluated and updated dynamically after each transaction. And the model allocates two global reputation values to each entity and takes the recognition value that how much the service provider knows the service into account. In order to make users choose the best matching services and give users with trusted services, the model also takes the similarity between services into account and uses the similarity degree to amend the integration reputation value with harmonic-mean. Finally, the effectiveness and feasibility of this model is illustrated by the experiment.
基金Supported by the National Natural Science Foundation of China (No. 60873203 ), the Natural Science Foundation of Hebei Province (No F2008000646) and the Guidance Program of the Department of Science and Technology in Hebei Province (No. 72135192).
文摘Because of the anonymity and openness of E-commerce, the on-line transaction and the selection of network resources meet new challenges. For this reason, a trust domain-based multi-agent model for network resource selection is presented. The model divides the network into numbers of trust domains and prevents the inconsistency of information maintained by different agents through the periodical communication among the agents. The model enables consumers to receive the response from the agents much quicker because the trust values of participators are evalUated and updated dynamically and timely after the completion of each transaction. In order to make users choose the best matching services and give users with trusted services, the model takes into account the similarity between services and the service providers' recognition to the services. Finally, the model illustrates the effectiveness and feasibility according to the experiment.
文摘Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities.Current approaches struggle with crossmodal alignment,temporal consistency,and robust handling of noisy or incomplete inputs across multiple modalities.We propose Multi Agent-Chain of Thought(CoT),a novel multi-agent chain-of-thought reasoning framework where specialized agents for text,vision,and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms.Our architecture incorporates self-reflection modules,conflict resolution protocols,and dynamic rationale alignment to enhance consistency,factual accuracy,and user engagement.The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity.Comprehensive evaluations on Situated Interactive Multi-Modal Conversations(SIMMC)2.0,VisDial v1.0,and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy(p<0.01),chain-of-thought interpretability,and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
基金supported by the National Natural Science Foundation of China Youth Fund(No.62101579)。
文摘Multi-Agent Systems(MAS),which consist of multiple interacting agents,are crucial in Cyber-Physical Systems(CPS),because they improve system adaptability,efficiency,and robustness through parallel processing and collaboration.However,most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems where data are distributed stored and inaccessible to all agents.Meta-GMVAE,based on Variational Autoencoder(VAE)and set-level variational inference,represents a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks,increasing adaptability and reducing sample requirements.Inspired by these advancements,we propose a novel Distributed Unsupervised Meta-Learning(DUML)framework based on Meta-GMVAE and a fusion strategy.Furthermore,we present a DUML algorithm based on Gaussian Mixture Model(DUMLGMM),where the parameters of the Gaussian-mixture are solved by an Expectation-Maximization algorithm.Simulations on Omniglot and Mini Image Net datasets show that DUMLGMM can achieve the performance of the corresponding centralized algorithm and outperform non-cooperative algorithm.
基金supported in part by the National Natural Science Foundation of China(grants 62203073 and 62573068)the Natural Science Foundation of Chongqing,China(grant CSTB2022NSCQMSX0577)。
文摘Multi-agent reinforcement learning(MARL)has proven its effectiveness in cooperative multi-agent systems(MASs)but still faces issues on the curse of dimensionality and learning efficiency.The main difficulty is caused by the strong inter-agent coupling nature embedded in an MARL problem,which is yet to be fully exploited in existing algorithms.In this work,we recognize a learning graph characterizing the dependence between individual rewards and individual policies.Then we propose a graph-based reward aggregation(GRA)method,which utilizes the inherent coupling relationship among agents to eliminate redundant information.Specifically,GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function,making each agent learn a decentralized executable cooperation policy.In addition,we propose a variant of GRA,named GRA-decen,which achieves decentralized training and decentralized execution(DTDE)when each agent only has access to information of partial agents in the learning process.We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
基金supported by the National Natural Science Foundation of China under Grants 61962023,61562029 and 62466019.
文摘This paper presents an adaptive multi-agent coordination(AMAC)strategy suitable for complex scenarios,which only requires information exchange between neighbouring robots.Unlike traditional multi-agent coordination methods that are solved by neural dynamics,the proposed strategy displays greater flexibility,adaptability and scalability.Furthermore,the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation.By introducing a dynamic error function,a fixed-time convergent zeroing neural network(FTCZNN)model is designed for the online solution of the AMAC strategy,with its convergence time upper bound derived theoretically.Finally,the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments.Numerical results indicate that this method can reduce the formation error to the order of 10^(-6)within 1.8 s.
文摘This paper addresses the synchronization of follower agents’state vectors with that of a leader in high-order nonlinear multi-agent systems.The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors,enabling the controller to rely solely on relative output measurements.This approach significantly reduces the dependence on full-state information,which is often infeasible or costly in practical engineering applications.An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization.Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
基金supported by the National Natural Science Foundation of China(62463007,62463005)the Natural Science Foundation of Hainan Province(625RC710,625MS047)+1 种基金the System Control and Information Processing Education Ministry Key Laboratory Open Funding,China(Scip20240119)the Science Research Funding of Hainan University,China(KYQD(ZR)22180,KYQD(ZR)23180).
文摘This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems.First,a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching.Then sufficient conditions are derived to guarantee the positive consensus.The gain matrices of the control protocol are described using a matrix decomposition approach and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions.Finally,two numerical examples are provided to illustrate the results obtained.
文摘In recent years,researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions;yet this paradigm provides no capacity for inter-agent interaction.Multi-agent reinforcement learning(MARL)overcomes this limitation by allowing several agents to learn simultaneously within a shared environment,each choosing actions that maximize its own or the group's rewards.By explicitly modeling and exploiting agent-to-agent dynamics,MARL can align those interactions with pedagogical goals such as peer tutoring,collaborative problem-solving,or gamified competition,thus opening richer avenues for adaptive and socially informed learning experiences.This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance,engagement,equity,and reducing teacher workload compared to single agent or traditional approaches.It explores the educational domains and pedagogical problems addressed by MARL,identifies the algorithmic families used,and analyzes their influence on learning.The review also assesses experimental settings and evaluation metrics to determine ecological validity,and outlines current challenges and future research directions in applying MARL to education.
文摘With the advent of sixth-generation mobile communications(6G),space-air-ground integrated networks have become mainstream.This paper focuses on collaborative scheduling for mobile edge computing(MEC)under a three-tier heterogeneous architecture composed of mobile devices,unmanned aerial vehicles(UAVs),and macro base stations(BSs).This scenario typically faces fast channel fading,dynamic computational loads,and energy constraints,whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings.To address this issue,we formulate a multi-agent Markov decision process(MDP)for an air-ground-fused MEC system,unify link selection,bandwidth/power allocation,and task offloading into a continuous action space and propose a joint scheduling strategy that is based on an improved MATD3 algorithm.The improvements include Alternating Layer Normalization(ALN)in the actor to suppress gradient variance,Residual Orthogonalization(RO)in the critic to reduce the correlation between the twin Q-value estimates,and a dynamic-temperature reward to enable adaptive trade-offs during training.On a multi-user,dual-link simulation platform,we conduct ablation and baseline comparisons.The results reveal that the proposed method has better convergence and stability.Compared with MADDPG,TD3,and DSAC,our algorithm achieves more robust performance across key metrics.
基金supported by the National Natural Science Foundation of China(No.52477097)the GuangDong Basic and Applied Basic Research Foundation(2023A1515240014)the State Key Laboratory of Advanced Electromagnetic Technology(Grant No.AET 2024KF005).
文摘To maximize the profits of power grid operators(GOs),load aggregators(LAs)and electricity customers(ECs),this paper proposes a hierarchical demand response(HDR)framework that considers competing interaction based on multiagent deep deterministic policy gradient(MaDDPG).The ECs are divided into conventional ECs and the electric vehicles(EVs)which are managed by ECs agent(ECA)and EV agent(EVA)to exploit the flexibility of the HDR framework.Thus,the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits.To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model,a dynamic interaction mechanism is adopted.Moreover,to tackle the HDR involving various entities,the MaDDPG develops multiple agents to simulation the dynamic competing interactions between each subject as well as solve the problem of continuous action control.Furthermore,MaDDPG adopts soft target update and priority experience replay method to ensure stable and effective training,and makes the exploration strategy comprehensive by using exploration noise.Simulation studies are conducted to verify the performance of the MaDDPG with dynamic interaction mechanism in dealing with multilayer multi-agent continuous action control,compared to the double deep Q network(DDQN),deep Q network(DQN)and dueling DQN.Additionally,comparisons among the proposed HDR with the price based DR(PBDR)and incentive based DR(IBDR)are analyzed to investigate the flexibility of the HDR.
基金supported in part by the Beijing Natural Science Foundation under Grant 4252050in part by the National Science Fund for Distinguished Young Scholars under Grant 62425304in part by the Basic Science Center Programs of NSFC under Grant 62088101.
文摘This paper investigates the consensus tracking control problem for high order nonlinear multi-agent systems subject to non-affine faults,partial measurable states,uncertain control coefficients,and unknown external disturbances.Under the directed topology conditions,an observer-based finite-time control strategy based on adaptive backstepping and is proposed,in which a neural network-based state observer is employed to approximate the unmeasurable system state variables.To address the complexity explosion problem associated with the backstepping method,a finite-time command filter is incorporated,with error compensation signals designed to mitigate the filter-induced errors.Additionally,the Butterworth low-pass filter is introduced to avoid the algebraic ring problem in the design of the controller.The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion,validating that all closed-loop signals of the system remain bounded within a finite time.Finally,the effectiveness of the proposed control strategy is verified through a simulation example.
基金co-supported by the National Natural Science Foundation of China(Nos.72371052 and 71871042).
文摘Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge.Multi-Agent Reinforcement Learning(MARL)provides an effective framework for tackling sequential decision-making problems,significantly enhancing swarm intelligence in maneuvering.However,applying MARL to unmanned swarms presents two primary challenges.First,defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries.Second,current algorithms aim to maximize global or individual rewards,making them sensitive to fluctuations in enemy strategies and environmental changes,especially when rewards are sparse.To tackle these issues,we propose an algorithm of MultiAgent Reinforcement Learning with Layered Autonomy and Collaboration(MARL-LAC)for collaborative confrontations.This algorithm integrates dual twin Critics to mitigate the high variance associated with policy gradients.Furthermore,MARL-LAC employs layered autonomy and collaboration to address multi-objective problems,specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents.Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents,outperforming the existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems.The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
基金Chinese R&D Program of "Development of a comprehensive monitoring and evaluation system for ecological compensation of typical ecologically vulnerable regions of China (2006BAC08B06)"National Science Fund for Distinguished Young Scholars (40788001)One Hundred Talents Program of the Chinese Academy of Sciences
文摘In many regions both urban expansion and rural development take place simultaneously, and for the purpose of understanding the dynamic process of land use/cover change (LUCC) in such large areas, this study develops a multi-agent based land use model. Taking the Poyang Lake area of China as a typical case, this study applies the mechanism of diffusion-limited aggregation to simulate the behavior of urban agents, while rural land use is illustrated with a bottom-up based model consisting of agent and environment layers. In the agent layer, each household agent makes its own decisions on land use, and at each time interval a government agent takes control of land use by implementing policies. According to incomes and the rate of migrant workers, household agents are divided into six categories, among which different decision rules are followed. For complex LUCC in the Poyang Lake area of China from 1985 to 2005, the artificial society model developed in this study yields results highly consistent with observations. Importantly, it is shown that governmental policies can impose significant effects on the decisions of individual household agents on land use and the multi-agent-based land use model developed in this study provides a robust means for assessing the effectiveness of governmental policies.
基金funded by the Deanship of Scientific Research at Northern Border University,Arar,Saudi Arabia,under project number NBU-FFR-2026-2441-02.
文摘This paper presents Dual Adaptive Neural Topology(Dual ANT),a distributed dual-network metaadaptive framework that enhances ant-colony-based multi-agent coordination with online introspection,adaptive parameter control,and privacy-preserving interactions.This approach improves standard Ant Colony Optimization(ACO)with two lightweight neural components:a forward network that estimates swarm efficiency in real time and an inverse network that converts these descriptors into parameter adaptations.To preserve the privacy of individual trajectories in shared pheromone maps,we introduce a locally differentially private pheromone update mechanism that adds calibrated noise to each agent’s pheromone deposit while preserving the efficacy of the global pheromone signal.The resulting systemenables agents to dynamically and autonomously adapt their coordination strategies under challenging and dynamic conditions,including varying obstacle layouts,uncertain target locations,and time-varying disturbances.Extensive simulations of large grid-based search tasks demonstrated that Dual ANT achieved faster convergence,higher robustness,and improved scalability compared to advanced baselines such asMulti-StrategyACO and Hierarchical ACO.The meta-adaptive feedback loop compensates for the performance degradation caused by privacy noise and prevents premature stagnation by triggering Levy flight exploration only when necessary.
基金supported in part by the National Natural Science Foundation of China(62522320,92267108,62173322)Liaoning Revitalization Talents Program(XLYC2403062)the Science and Technology Program of Liaoning Province(2023JH3/10200004,2022JH25/10100005)。
文摘The wireless cloud robotic system(WCRS),which fully integrates sensing,communication,computing,and control capabilities as an intelligent agent,is a promising way to achieve intelligent manufacturing due to easy deployment and flexible expansion.However,the high-precision control of WCRS requires deterministic wireless communication,which is always challenging in the complex and dynamic radio space.This paper employs the reconfigurable intelligent surface(RIS)to establish a novel RIS-assisted WCRS architecture,where the radio channel is controlled to achieve ultra-reliable,low-delay,and low-jitter communication for high-precision closed-loop motion control.However,control and communication are strongly coupled and should be co-optimized.Fully considering the constraints of control input threshold,control delay deadline,beam phase,antenna power,and information distortion,we establish a stability maximization problem to jointly optimize control input compensation,RIS phase shift,and beamforming.Herein,a new jitter-oriented system stability objective with respect to control error and communication jitter is defined and the closed-form expression of control delay deadline is derived based on the Jensen Inequality and Lyapunov-Krasovskii functional.Due to the time-varying and partial observability of the channel and robot states,we model the problem as a partially observable Markov decision process(POMDP).To solve this complex problem,we propose a multi-agent transfer reinforcement learning algorithm named LSTM-PPO-MATRL,where the LSTM-enhanced proximal policy optimization(PPO)is designed to approximate an optimal solution and the option-guided policy transfer learning is proposed to facilitate the learning process.By centralized training and decentralized execution,LSTM-PPO-MATRL is validated by extensive experiments on MuJoCo tasks for both low-mobility and high-mobility robotic control scenarios.The results demonstrate that LSTM-PPO-MATRL not only realizes high learning efficiency,but also supports low-delay,low-jitter communication for low error control,where 71.9%control accuracy improvement and 68.7%delay jitter reduction are achieved compared to the PPO-MADRL baseline.
基金supported by the Natural Science Foundation of China(NSFC No.62472099)Ji Hua Laboratory S&T Program(X250881UG250).
文摘With the increasing role of social media in information dissemination,effectively simulating and analyzing public event dynamics has become a key research focus.We present an interactive visual analysis system for simulating social media events using multi-agent models powered by large language models(LLMs).By modeling agents with diverse characteristics,the system explores how agents perceive information,adjust their emotions and stances,provide feedback,and influence the trajectory of events.The system integrates real-time interactive simulation with multi-perspective visualization,enabling users to investigate event trajectories and key influencing factors under varied configurations.Theoretical work standardizes agent attributes and interaction mechanisms,supporting realistic simulation of social media behaviors.Evaluation through indicators and case studies demonstrates the system’s effectiveness and adaptability,offering a novel tool for public event analysis across open social platforms.