Abstract: In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared with single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Funding: co-supported by the National Natural Science Foundation of China (Nos. 72371052 and 71871042).
Abstract: Addressing optimal confrontation methods in multi-agent attack-defense scenarios is a complex challenge. Multi-Agent Reinforcement Learning (MARL) provides an effective framework for tackling sequential decision-making problems, significantly enhancing swarm intelligence in maneuvering. However, applying MARL to unmanned swarms presents two primary challenges. First, defensive agents must balance autonomy with collaboration under limited perception while coordinating against adversaries. Second, current algorithms aim to maximize global or individual rewards, making them sensitive to fluctuations in enemy strategies and environmental changes, especially when rewards are sparse. To tackle these issues, we propose Multi-Agent Reinforcement Learning with Layered Autonomy and Collaboration (MARL-LAC), an algorithm for collaborative confrontations. This algorithm integrates dual twin critics to mitigate the high variance associated with policy gradients. Furthermore, MARL-LAC employs layered autonomy and collaboration to address multi-objective problems, specifically learning a global reward function for the swarm alongside local reward functions for individual defensive agents. Experimental results demonstrate that MARL-LAC enhances decision-making and collaborative behaviors among agents, outperforming existing algorithms and emphasizing the importance of layered autonomy and collaboration in multi-agent systems. The observed adversarial behaviors demonstrate that agents using MARL-LAC effectively maintain cohesive formations that conceal their intentions by confusing the offensive agent while successfully encircling the target.
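The abstract above does not say how the twin critics are combined. As an illustration only, the sketch below shows one widely used pattern from TD3-style clipped double-Q learning, where the bootstrap target uses the smaller of two critic estimates. That technique is usually motivated by overestimation bias, and it is an assumption here, not the MARL-LAC update rule. The function name `twin_critic_target` and all numeric values are hypothetical.

```python
def twin_critic_target(reward, q1_next, q2_next, done, gamma=0.99):
    """One-step bootstrap target using the smaller of two critic estimates.

    Assumption: clipped double-Q style combination (as in TD3). This is NOT
    taken from the MARL-LAC paper, whose update rule the abstract omits.
    """
    # Pessimistic estimate of the next state-action value.
    q_min = min(q1_next, q2_next)
    # Terminal transitions do not bootstrap from the next state.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * q_min


# Hypothetical transition values, chosen only to exercise both branches.
target_running = twin_critic_target(1.0, 2.0, 3.0, done=False, gamma=0.5)
target_terminal = twin_critic_target(0.0, 5.0, 4.0, done=True, gamma=0.5)
```

Both critics would then be regressed toward the same target, so a single optimistic critic cannot inflate the value estimate on its own.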
Funding: supported in part by the National Natural Science Foundation of China (62276119), the Natural Science Foundation of Jiangsu Province (BK20241764), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_2860).
Abstract: Dear Editor, This letter investigates predefined-time optimization problems (OPs) of multi-agent systems (MASs), where the agents of the MASs are subject to inequality constraints and the team objective function accounts for impulse effects. First, the penalty method is introduced to address the inequality constraints. Then, a novel optimization strategy is developed, which only requires that the team objective function be strongly convex.
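To make the penalty idea concrete, here is a minimal single-variable sketch, assuming a quadratic penalty on constraint violation and plain gradient descent. It is centralized and has no predefined-time guarantee or impulse effects, so it is not the letter's strategy. The objective, constraint, penalty weight `rho`, and step size are all hypothetical choices.

```python
def penalized_gradient_descent(grad_f, g, grad_g, x0, rho, step, iters):
    """Minimize f(x) + (rho / 2) * max(0, g(x))**2 by gradient descent.

    The constraint g(x) <= 0 is enforced only approximately: the penalty
    term is zero inside the feasible set and grows quadratically outside it.
    """
    x = x0
    for _ in range(iters):
        violation = max(0.0, g(x))  # zero when the constraint holds
        x -= step * (grad_f(x) + rho * violation * grad_g(x))
    return x


# Hypothetical problem: minimize (x - 2)^2 subject to x <= 1.
rho = 1000.0
x_star = penalized_gradient_descent(
    grad_f=lambda x: 2.0 * (x - 2.0),  # gradient of the strongly convex objective
    g=lambda x: x - 1.0,               # constraint written as g(x) <= 0
    grad_g=lambda x: 1.0,
    x0=0.0,
    rho=rho,
    step=1.0 / (2.0 + rho),            # 1 / L, where L bounds the gradient's slope
    iters=5000,
)
```

For this toy problem the penalized minimizer is (4 + rho) / (2 + rho), which lies slightly outside the feasible set and approaches the constrained optimum x = 1 as `rho` grows.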
Funding: supported by the Natural Science Foundation of Jiangsu Province (BK20240009), the National Natural Science Foundation of China (62373105, 62373262), and the Jiangsu Provincial Scientific Research Center of Applied Mathematics (BK20233002).
Abstract: Dear Editor, This letter studies a real-world issue in leader-follower multi-agent systems (MASs) named open topology, which permits variations of the agent set and the network connections. Specifically, a novel transition process is developed to explain how the variation of network scale affects the dynamic behavior of the MASs. From a resource-limited perspective, a distributed saturated impulsive control is then designed, under which sufficient criteria for local quasi-consensus performance are established. We also provide a combined optimization algorithm for all agents to make the estimated domain of initial errors closer to the real one, thereby resulting in less conservativeness. Finally, a numerical example validates our results.
Funding: supported by the Science Center of the National Natural Science Foundation of China (Grant No. 12188101), the National Key R&D Program of China (Grant Nos. 2023YFA1607400 and 2022YFA1403800), the National Natural Science Foundation of China (Grant Nos. 12274436, 11925408, and 11921004), and the New Cornerstone Science Foundation through the XPLORER PRIZE; performed on the robotic AI-Scientist platform of the Chinese Academy of Sciences.
Abstract: Density-functional-theory (DFT) simulations with the Vienna Ab initio Simulation Package (VASP) are indispensable in computational materials science but often require extensive manual setup, monitoring, and postprocessing. Here, we introduce VASPilot, an open-source platform that fully automates VASP workflows via a multi-agent architecture built on the CrewAI framework and a standardized model context protocol (MCP). VASPilot’s agent suite handles every stage of a VASP study, from retrieving crystal structures and generating input files to submitting Slurm jobs, parsing error messages, and dynamically adjusting parameters for seamless restarts. A lightweight Quart-based web interface provides intuitive task submission, real-time progress tracking, and drill-down access to execution logs, structure visualizations, and plots. We validated VASPilot on both routine and advanced benchmarks: automated band-structure and density-of-states calculations (including on-the-fly symmetry corrections), plane-wave cutoff convergence tests, lattice-constant optimizations with various van der Waals corrections, and cross-material band-gap comparisons for transition-metal dichalcogenides. In all cases, VASPilot completed the missions reliably and without manual intervention. Moreover, its modular design allows easy extension to other DFT codes simply by deploying the appropriate MCP server. By offloading technical overhead, VASPilot enables researchers to focus on scientific discovery and accelerates high-throughput computational materials research.
Funding: supported by the Key R&D Projects in Jiangsu Province (BE2021729) and the Key Primary Research Project of Primary Strengthening Program (KYZYJKKCJC23001).
Abstract: Multi-agent systems often require good interoperability in the process of completing their assigned tasks. This paper first models the static structure and dynamic behavior of multi-agent systems based on a layered weighted scale-free community network and the susceptible-infected-recovered (SIR) model. To address the difficulty of describing changes in the structure and collaboration mode of the system under external factors, a two-dimensional Monte Carlo method and an improved dynamic Bayesian network are used to simulate the impact of external environmental factors on multi-agent systems. A collaborative information flow path optimization algorithm for agents under environmental factors is designed based on the Dijkstra algorithm. A method for evaluating system interoperability is designed based on simulation experiments, providing a reference for the construction planning and the optimization of organizational application of the system. Finally, the feasibility of the method is verified through case studies.
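The path optimization step is described only as being based on the Dijkstra algorithm. As a point of reference, the sketch below is the textbook Dijkstra shortest-path search over a weighted directed graph, without the environmental-factor modeling the paper adds. The agent names and edge weights, read here as hypothetical information-flow costs, are assumptions for illustration.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (cost, path) of a cheapest route from source to target.

    graph maps each node to a dict {neighbor: non-negative edge weight}.
    Returns (inf, []) when the target cannot be reached.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry for an already settled node
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical agent network; weights stand in for information-flow cost.
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 2.0, "D": 5.0},
    "C": {"D": 1.0},
    "D": {},
}
cost, path = dijkstra(graph, "A", "D")
```

In a setting like the paper's, the edge weights would presumably be updated as environmental factors change and the search rerun, but that coupling is not specified in the abstract.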