In multi-agent confrontation scenarios, a single jammer is constrained by its limited performance and is inefficient in practical applications. To address these issues, this paper investigates the multi-agent jamming problem in a multi-user scenario, where coordination between the jammers is considered. First, a multi-agent Markov decision process (MDP) framework is used to model and analyze the multi-agent jamming problem. Second, a collaborative multi-agent jamming algorithm (CMJA) based on reinforcement learning is proposed. Finally, an intelligent jamming system is designed and built on a software-defined radio (SDR) platform for simulation and platform verification. The simulation and platform verification results show that the proposed CMJA outperforms the independent Q-learning method and provides a better jamming effect.
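For orientation, the independent Q-learning baseline named in this abstract can be sketched as a tabular update that each agent applies to its own table without seeing the other agents. This is a minimal sketch and not the paper's CMJA: the function name `q_update`, the learning rate `alpha`, the discount `gamma`, and the toy state and action labels are all assumptions made for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step to table q and return the new value.

    q maps (state, action) pairs to estimated returns. All names and default
    hyperparameters here are illustrative assumptions.
    """
    # Greedy value of the next state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Independent learners: one private table per agent, no shared information.
q_tables = [defaultdict(float) for _ in range(2)]
actions = [0, 1]  # hypothetical choices, e.g. which channel to jam
first = q_update(q_tables[0], "s0", 0, 1.0, "s1", actions)
second = q_update(q_tables[0], "s0", 0, 1.0, "s1", actions)
```

Because each agent updates only its own table, the other agents appear as part of a changing environment, which is the usual motivation for adding explicit coordination on top of this baseline.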
In multi-agent systems, autonomous agents may form coalitions to increase the efficiency of problem solving. However, current coalition algorithms are very complex and cannot satisfy the conditions of optimality and stability simultaneously. To solve this problem, an algorithm that uses the mechanism of distribution according to work for coalition formation is presented, which can achieve a globally optimal and stable solution in subadditive task-oriented domains. The validity of the algorithm is demonstrated by both experiments and theory.
Because parameter optimization greatly affects the prediction accuracy and generalization ability of support vector regression (SVR), and the predictive model often mismatches the nonlinear system in model predictive control, a multi-step model predictive control method based on online SVR (OSVR) optimized by a multi-agent particle swarm optimization (MAPSO) algorithm is put forward. By integrating the online learning ability of OSVR, the predictive model can self-correct and adapt well to dynamic changes in the nonlinear process.
To obtain a globally optimized solution to the Elevator Group Control System (EGCS) scheduling problem, an algorithm with an overall optimization function is needed. In this study, Real-time Particle Swarm Optimization (RPSO) is proposed to find an optimal solution to the EGCS scheduling problem. Different traffic patterns and controller mechanisms for EGCS are analyzed. This study focuses on up-peak traffic because of its critical importance to modern office buildings. Simulation results show that EGCS based on Multi-Agent Systems (MAS) using RPSO gives good results for the up-peak EGCS scheduling problem. In addition, real-time elevator scheduling and reallocation functions are realized based on RPSO for cases in which new information becomes available or an elevator becomes busy because it is unavailable or full. This study contributes a new scheduling algorithm for EGCS and expands the application of PSO.
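The abstract does not describe how RPSO differs from standard PSO, so the following is only a sketch of the textbook global-best PSO that such variants build on. The function name `pso`, the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search box, and the sphere cost used in place of an elevator scheduling cost are all assumptions.

```python
import random

def pso(cost, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost over R^dim with a basic global-best particle swarm."""
    rng = random.Random(seed)
    # Random initial positions in an assumed search box [-5, 5]^dim.
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

# Toy usage: a sphere function stands in for a real scheduling cost.
best, best_cost = pso(lambda x: sum(v * v for v in x), dim=2)
```

In a scheduling setting the continuous position would still have to be decoded into a discrete assignment of hall calls to elevators; that decoding step is problem-specific and is not shown here.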
This paper introduces a multi-agent system which integrates process planning and production scheduling, in order to increase the flexibility of manufacturing systems in coping with rapid changes in a dynamic market and dealing with internal uncertainties such as machine breakdown or resource shortage. The system consists of various autonomous agents, each of which is capable of communicating with the others and making decisions based on its own knowledge and, if necessary, on information provided by other agents. Machine agents, which represent the machines, play an important role in the system in that they negotiate with each other to bid for jobs. An iterative bidding mechanism is proposed to facilitate the assignment of jobs to machines and to handle the negotiation between agents. This mechanism enables near-optimal process plans and production schedules to be produced concurrently, so that dynamic changes in the market can be coped with at minimum cost and the utilisation of manufacturing resources can be optimised. In addition, a currency scheme with currency-like metrics is proposed to encourage or prohibit the submission of bids by machine agents for the jobs announced. The values of the metrics are adjusted iteratively so as to obtain an integrated plan and schedule that results in the minimum total production cost while satisfying product due dates. To deal with the optimisation problem, i.e. to what degree and how the currencies should be adjusted in each iteration, a genetic algorithm (GA) is developed. Comparisons are made between the GA approach and the simulated annealing (SA) optimisation technique.
Although quantum Bayesian networks provide a promising paradigm for multi-agent decision-making, their practical application faces two challenges in the noisy intermediate-scale quantum (NISQ) era. First, limited qubit resources restrict direct application to large-scale inference tasks. Second, no quantum methods are currently available for multi-agent collaborative decision-making. To address these challenges, we propose a hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks, comprising two novel methods. The first is a hybrid quantum–classical inference method based on hierarchical Bayesian networks. It decomposes large-scale hierarchical Bayesian networks into modular subnetworks. Inference for each subnetwork can be performed on NISQ devices, and the intermediate results are converted into classical messages for cross-layer transmission. The second is a multi-agent decision-making method that uses the variational quantum eigensolver (VQE) in the influence diagram. This method models collaborative decision-making with the influence diagram, encodes the expected utility of diverse actions into a Hamiltonian, and then determines the intra-group optimal action efficiently. Experimental validation on the IonQ quantum simulator demonstrates that the hierarchical method outperforms the non-hierarchical method at the functional inference level, and that the VQE method can obtain the optimal strategy exactly at the collaborative decision-making level. Our research not only extends the application of quantum computing to multi-agent decision-making but also provides a practical solution for the NISQ era.
To address premature convergence and slow convergence in three-dimensional (3D) route planning for unmanned aerial vehicle (UAV) low-altitude penetration, a novel route planning method is proposed. First, a coevolutionary multi-agent genetic algorithm (CE-MAGA) is formed by introducing a coevolutionary mechanism into the multi-agent genetic algorithm (MAGA), an efficient global optimization algorithm. A dynamic route representation is also adopted to improve flight route accuracy. Moreover, an efficient constraint handling method is used to simplify the treatment of multiple constraints and reduce the time cost of planning computation. Simulation and the corresponding analysis show that the routes planned by CE-MAGA perform better on terrain following, terrain avoidance, and threat avoidance (TF/TA2) and have lower route costs than those of other existing algorithms. In addition, feasible flight routes can be acquired within 2 s, and the whole evolutionary process converges very quickly.
Abstract: To address the shortcomings of the traditional Genetic Algorithm (GA) in multi-agent path planning, such as prolonged planning time, slow convergence, and solution instability, this paper proposes an Asynchronous Genetic Algorithm (AGA) to solve multi-agent path planning problems effectively. To enhance the real-time performance and computational efficiency of Multi-Agent Systems (MAS) in path planning, the AGA incorporates an Equal-Size Clustering Algorithm (ESCA) based on the K-means clustering method. The ESCA divides the primary task evenly into a series of subtasks, thereby reducing the gene length in the subsequent GA process. The algorithm then employs GA to solve each subtask sequentially. To evaluate the effectiveness of the proposed method, a simulation program was designed to perform path planning for 100 trajectories, and the results were compared with those of State-Of-The-Art (SOTA) methods. The simulation results demonstrate that, although the solutions provided by AGA are suboptimal, it exhibits significant advantages in execution speed and solution stability compared with other algorithms.
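The abstract gives no details of ESCA beyond "equal-size" and "based on K-means", so the following is only one plausible sketch of the idea: a K-means-style loop in which every cluster is capped at ceil(n / k) members, so that the subtasks come out equal in size when k divides n. The function name `equal_size_clusters`, the greedy closest-pair assignment, and the 2-D task points are assumptions and not the paper's method.

```python
import math
import random

def equal_size_clusters(points, k, iters=10, seed=0):
    """Label each point with a cluster index; no cluster exceeds ceil(n / k)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    cap = math.ceil(len(points) / k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Visit every (point, center) pair, closest pairs first.
        pairs = sorted(
            (math.dist(p, c), i, j)
            for i, p in enumerate(points)
            for j, c in enumerate(centers)
        )
        sizes = [0] * k
        assigned = [False] * len(points)
        for _, i, j in pairs:
            # Take the nearest center that still has room.
            if not assigned[i] and sizes[j] < cap:
                labels[i] = j
                assigned[i] = True
                sizes[j] += 1
        # Standard K-means step: move each center to the mean of its members.
        for j in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

# Toy usage: 12 task locations split into 3 subtasks of 4 locations each.
tasks = [(float(x), float(y)) for x in range(4) for y in range(3)]
task_labels = equal_size_clusters(tasks, 3)
```

With n targets split into k equal subtasks, a chromosome that encodes a visiting order only needs about n / k genes per subtask instead of n, which is the gene-length reduction the abstract refers to.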
Funding: The National Natural Science Foundation of China (No. 61300167); the Open Project Program of the State Key Laboratory for Novel Software Technology of Nanjing University (No. KFKT2015B17); the Natural Science Foundation of Jiangsu Province (No. BK20151274); the Qing Lan Project of Jiangsu Province; the Open Project Program of the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of the Ministry of Education (No. JYB201606); the Program for Special Talent in Six Fields of Jiangsu Province (No. XYDXXJS-048).
Abstract: To improve the performance of attribute reduction algorithms on noisy and uncertain large data, a novel co-evolutionary cloud-based attribute ensemble multi-agent reduction (CCAEMR) algorithm is proposed. First, a co-evolutionary cloud framework is designed under the MapReduce mechanism to divide the entire population into different co-evolutionary subpopulations with a self-adaptive scale. Meanwhile, these subpopulations share their rewards to accelerate attribute reduction. Second, a multi-agent ensemble strategy of co-evolutionary elitist optimization is constructed to ensure that subpopulations can exploit any correlation and interdependency between interacting attribute subsets while reinforcing noise tolerance. Hence, these agents are kept within the stable elitist region to achieve the optimal profit. The experimental results show that the proposed CCAEMR algorithm is more efficient and feasible for solving large-scale and uncertain dataset problems with complex noise.
Abstract: To address the deficiencies of conventional traffic control methods, this paper proposes a new traffic control method based on multi-agent technology. Unlike many existing methods, this paper distinguishes traffic control based on agent technology from conventional traffic control methods. The composition and structure of a multi-agent system (MAS) are first discussed. Then, the step-coordination strategies of the intersection-agent, segment-agent, and area-agent are put forward. The advantages of the algorithm are demonstrated by a simulation study.
Funding: The National Natural Science Foundation of China (No. 61273110); the Specialized Fund for the Doctoral Program of Higher Education (No. 20130092130002).
Abstract: For the coordinated motion and cooperative control of multiple agents in a non-rectangular bounded space, a velocity consensus algorithm for agents with double-integrator dynamics is presented. The traditional consensus algorithm for bounded space is applicable only to rectangular bouncing boundaries and is not suitable for non-rectangular spaces. To extend the previous consensus algorithm to non-rectangular spaces, the concept of mirrored velocity is introduced, which converts the discontinuous real velocity into a continuous mirrored velocity and expands a bounded space into an infinite space. Using the consensus algorithm, it is found that the mirrored velocities of the agents asymptotically converge to the same values. Because each mirrored velocity points to a unique velocity in real space, it can be concluded that the real velocities of the agents also asymptotically converge. Finally, the effectiveness of the proposed consensus algorithm is examined by theoretical proof and numerical simulations. Moreover, the algorithm is successfully tested in an experiment on a real multi-robot system.
Abstract: In multi-agent systems, joint actions must be employed to achieve cooperation because the evaluation of an agent's behavior often depends on the behaviors of the other agents. However, joint-action reinforcement learning algorithms suffer from slow convergence because of the enormous learning space produced by joint actions. In this article, a prediction-based reinforcement learning algorithm is presented for multi-agent cooperation tasks, which requires all agents to learn to predict the probabilities of the actions that other agents may execute. A multi-robot cooperation experiment is run to test the efficacy of the new algorithm, and the experimental results show that the new algorithm can achieve the cooperation policy much faster than the primitive reinforcement learning algorithm.
Abstract: The resource-constrained project scheduling problem (RCPSP) is addressed, and a decision-making model based on multi-agent systems (MAS) and general equilibrium marketing is proposed. An algorithm leading to the resource allocation decision involved in RCPSP has also been developed. This algorithm can be used in the multi-project scheduling field as well. Finally, an illustration is given.
Funding: Supported by the Key Research and Development Project in Guangdong Province (2020B0101050001), the National Science Foundation of China (61973214, 61590924, 61963030), and the Natural Science Foundation of Shanghai (19ZR1476200).
Abstract: In this paper, we consider distributed convex optimization problems on multi-agent networks. We develop and analyze a distributed gradient method which allows each agent to compute its dynamic stepsize by utilizing the time-varying estimate of the local function value at the global optimal solution. Our approach can be applied to both synchronous and asynchronous communication protocols. Specifically, we propose the distributed subgradient with uncoordinated dynamic stepsizes (DS-UD) algorithm for the synchronous protocol and the AsynDGD algorithm for the asynchronous protocol. Theoretical analysis shows that the proposed algorithms guarantee that all agents reach a consensus on the solution to the multi-agent optimization problem. Moreover, the proposed approach with dynamic stepsizes eliminates the requirement of a diminishing stepsize in existing works. Numerical examples of distributed estimation in sensor networks are provided to illustrate the effectiveness of the proposed approach.
Funding: The National Natural Science Foundation of China (62136008, 62293541); the Beijing Natural Science Foundation (4232056); the Beijing Nova Program (20240484514).
Abstract: Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
Abstract: Traditionally, heuristic re-planning algorithms are used to tackle the problem of dynamic task planning for multiple satellites. However, traditional heuristic strategies depend on the concrete tasks, which often affects the optimality of the result. Noting that the historical information of cooperative task planning affects later planning results, we propose a hybrid learning algorithm for dynamic multi-satellite task planning, which is based on multi-agent reinforcement learning with policy iteration and on transfer learning. The reinforcement learning strategy of each satellite is described with neural networks. The policy neural network individuals with the best topological structure and weights are found by applying co-evolutionary search iteratively. To avoid the failure of historical learning caused by randomly occurring observation requests, a novel approach is proposed to balance the quality and efficiency of task planning, which converts the historical learning strategy into the current initial learning strategy by applying the transfer learning algorithm. The simulations and analysis show the feasibility and adaptability of the proposed approach, especially for situations with randomly occurring observation requests.
Funding: The National Natural Science Foundation of China (W2431048); the Science and Technology Research Program of Chongqing Municipal Education Commission, China (KJZDK202300807); the Chongqing Natural Science Foundation, China (CSTB2024NSCQQCXMX0052).
Abstract: This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, a memory-based adaptive event-triggered mechanism is combined with neural networks to approximate the nonlinear terms in the networked system and to conserve system resources efficiently. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61104092, 61134007, and 61203147, and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: To solve the dynamical consensus problem of second-order multi-agent systems with communication delay, delay-dependent compensations are added to the normal asynchronously-coupled consensus algorithm so that the agents achieve a dynamical consensus. Based on frequency-domain analysis, sufficient conditions are obtained for second-order multi-agent systems with communication delay under leaderless and leader-following consensus algorithms, respectively. Simulations illustrate the correctness of the results.
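For orientation, the following is a minimal sketch of a delay-free, leaderless second-order consensus protocol, which is the baseline that a delay-compensated algorithm of this kind builds on. It is not the method of the paper: the communication delay and its compensation terms are omitted, and the function name, the gain `gamma`, the ring topology in the usage note, and the explicit-Euler integration are all illustrative assumptions.

```python
def simulate_consensus(x0, v0, neighbors, gamma=1.0, dt=0.01, steps=3000):
    """Explicit-Euler simulation of delay-free second-order consensus.

    Each agent i has position x[i] and velocity v[i] and applies
        u_i = sum_j [(x_j - x_i) + gamma * (v_j - v_i)]
    over its neighbours j. All names and defaults are hypothetical.
    """
    x, v = list(x0), list(v0)
    n = len(x)
    for _ in range(steps):
        # Control input computed from the current (undelayed) neighbour states.
        u = [
            sum((x[j] - x[i]) + gamma * (v[j] - v[i]) for j in neighbors[i])
            for i in range(n)
        ]
        # Double-integrator dynamics: x' = v, v' = u.
        x = [x[i] + dt * v[i] for i in range(n)]
        v = [v[i] + dt * u[i] for i in range(n)]
    return x, v


# Four agents on an undirected ring (assumed topology).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final_x, final_v = simulate_consensus([0.0, 1.0, 4.0, -2.0],
                                      [1.0, 0.0, -1.0, 0.5], ring)
```

On a connected undirected graph the positions and velocities of all agents converge to common values, while the group keeps moving at the average of the initial velocities; this moving agreement is what "dynamical consensus" refers to.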
Funding: Supported by the National Natural Science Foundation of China (Nos. 12272104, U22B2013).
Abstract: This paper investigates the challenges of Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic, unknown environments with a limited field of view. The primary objective is to explore the unknown environment to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on a Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs that establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively, and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. A framework that integrates centralized training and distributed execution is introduced to facilitate model improvement. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance, with fewer collisions. The experiments also show that the proposed method can be extended to applications with a larger number of agents, providing a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62071488 and 62061013).
Abstract: In multi-agent confrontation scenarios, a single jammer is constrained by its limited performance and by inefficiency in practical application. To cope with these issues, this paper investigates the multi-agent jamming problem in a multi-user scenario, where the coordination between the jammers is considered. Firstly, a multi-agent Markov decision process (MDP) framework is used to model and analyze the multi-agent jamming problem. Secondly, a collaborative multi-agent jamming algorithm (CMJA) based on reinforcement learning is proposed. Finally, an actual intelligent jamming system is designed and built on a software-defined radio (SDR) platform for simulation and platform verification. The simulation and platform verification results show that the proposed CMJA outperforms the independent Q-learning method and provides a better jamming effect.
Abstract: In multi-agent systems, autonomous agents may form coalitions to increase the efficiency of problem solving. However, current coalition formation algorithms are very complex and cannot satisfy the conditions of optimality and stability simultaneously. To solve this problem, an algorithm that uses the mechanism of distribution according to work for coalition formation is presented, which can achieve a globally optimal and stable solution in subadditive task-oriented domains. The validity of the algorithm is demonstrated both experimentally and theoretically.
Funding: The National Natural Science Foundation of China (No. 60905066); the Natural Science Foundation of Chongqing (No. cstc2018jcyjA0667).
Abstract: Because parameter optimization greatly affects the prediction accuracy and generalization ability of support vector regression (SVR), and the predictive model is often mismatched in model predictive control of nonlinear systems, a multi-step model predictive control method based on online SVR (OSVR) optimized by a multi-agent particle swarm optimization algorithm (MAPSO) is put forward. By integrating the online learning ability of OSVR, the predictive model can self-correct and adapt well to dynamic changes in the nonlinear process.
Abstract: To obtain a globally optimized solution for the Elevator Group Control System (EGCS) scheduling problem, an algorithm with an overall optimization function is needed. In this study, Real-time Particle Swarm Optimization (RPSO) is proposed to find an optimal solution to the EGCS scheduling problem. Different traffic patterns and controller mechanisms for EGCS are analyzed. This study focuses on up-peak traffic because of its critical importance to modern office buildings. Simulation results show that an EGCS based on Multi-Agent Systems (MAS) using RPSO gives good results for the up-peak EGCS scheduling problem. In addition, real-time elevator scheduling and reallocation are implemented with RPSO for cases where new information becomes available or an elevator is busy, i.e. unavailable or full. This study contributes a new scheduling algorithm for EGCS and expands the application of PSO.
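To make the underlying optimizer concrete, the following is a minimal sketch of a standard global-best particle swarm optimizer. It is a generic PSO and not the RPSO of the paper: the real-time rescheduling logic and the elevator cost model are not reproduced, and the function names, parameter values, and the sphere test objective are illustrative assumptions.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic global-best PSO; every name and default here is hypothetical."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective standing in for a scheduling cost (an assumption)."""
    return sum(v * v for v in x)


solution, cost = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

In a scheduling setting, a particle would instead encode a candidate assignment of hall calls to elevators and the objective would score it, for example by waiting time; that encoding is not specified in the abstract and is left out here.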
Abstract: This paper introduces a multi-agent system which integrates process planning and production scheduling, in order to increase the flexibility of manufacturing systems in coping with rapid changes in a dynamic market and dealing with internal uncertainties such as machine breakdown or resource shortage. The system consists of various autonomous agents, each of which is capable of communicating with the others and making decisions based on its own knowledge and, if necessary, on information provided by other agents. Machine agents, which represent the machines, play an important role in the system in that they negotiate with each other to bid for jobs. An iterative bidding mechanism is proposed to facilitate the assignment of jobs to machines and handle the negotiation between agents. This mechanism enables near-optimal process plans and production schedules to be produced concurrently, so that dynamic changes in the market can be coped with at minimum cost and the utilisation of manufacturing resources can be optimised. In addition, a currency scheme with currency-like metrics is proposed to encourage machine agents to put forward bids for the announced jobs or prohibit them from doing so. The values of the metrics are adjusted iteratively so as to obtain an integrated plan and schedule that results in the minimum total production cost while satisfying product due dates. To deal with the optimisation problem, i.e. to what degree and how the currencies should be adjusted in each iteration, a genetic algorithm (GA) is developed. Comparisons are made between the GA approach and the simulated annealing (SA) optimisation technique.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62473371 and 61673389).
Abstract: Although quantum Bayesian networks provide a promising paradigm for multi-agent decision-making, their practical application faces two challenges in the noisy intermediate-scale quantum (NISQ) era. Limited qubit resources restrict direct application to large-scale inference tasks. Additionally, no quantum methods are currently available for multi-agent collaborative decision-making. To address these challenges, we propose a hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks, comprising two novel methods. The first is a hybrid quantum–classical inference method based on hierarchical Bayesian networks. It decomposes large-scale hierarchical Bayesian networks into modular subnetworks. The inference for each subnetwork can be performed on NISQ devices, and the intermediate results are converted into classical messages for cross-layer transmission. The second is a multi-agent decision-making method using the variational quantum eigensolver (VQE) in the influence diagram. This method models collaborative decision-making with the influence diagram, encodes the expected utility of diverse actions into a Hamiltonian, and subsequently determines the intra-group optimal action efficiently. Experimental validation on the IonQ quantum simulator demonstrates that the hierarchical method outperforms the non-hierarchical method at the functional inference level, and that the VQE method can obtain the optimal strategy exactly at the collaborative decision-making level. Our research not only extends the application of quantum computing to multi-agent decision-making but also provides a practical solution for the NISQ era.
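The idea of encoding expected utilities into a Hamiltonian can be illustrated with a small classical sketch. It assumes one common encoding, a diagonal Hamiltonian whose entry for each computational basis state is the negated expected utility of the corresponding action, so that the ground state marks the best action. The abstract does not specify the encoding actually used, and the exhaustive minimisation below merely stands in for the VQE; all names and numbers are hypothetical.

```python
def utility_hamiltonian(expected_utilities):
    """Diagonal of H = -sum_a EU(a) |a><a| (assumed encoding, not the paper's)."""
    return [-u for u in expected_utilities]


def optimal_action(expected_utilities):
    """Return the lowest-energy basis state as a bitstring, with its energy.

    A VQE would estimate this ground state variationally on quantum
    hardware; here the diagonal is simply scanned classically.
    """
    diagonal = utility_hamiltonian(expected_utilities)
    n_bits = max(1, (len(diagonal) - 1).bit_length())  # qubits needed to index actions
    index = min(range(len(diagonal)), key=lambda k: diagonal[k])
    return format(index, "0{}b".format(n_bits)), diagonal[index]


# Four joint actions indexed by two qubits; the utilities are made-up numbers.
action, energy = optimal_action([0.2, 0.9, 0.5, 0.1])
```

Because the Hamiltonian is diagonal in this sketch, minimising the energy is equivalent to maximising the expected utility, which is why a ground-state solver can be used for action selection.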
Funding: Project (60925011) supported by the National Natural Science Foundation for Distinguished Young Scholars of China; Project (9140A06040510BQXXXX) supported by the Advanced Research Foundation of the General Armament Department, China.
Abstract: To address the issues of premature convergence and slow convergence rate in three-dimensional (3D) route planning for unmanned aerial vehicle (UAV) low-altitude penetration, a novel route planning method is proposed. First, a coevolutionary multi-agent genetic algorithm (CE-MAGA) is formed by introducing a coevolutionary mechanism into the multi-agent genetic algorithm (MAGA), an efficient global optimization algorithm. A dynamic route representation is also adopted to improve flight route accuracy. Moreover, an efficient constraint-handling method is used to simplify the treatment of multiple constraints and reduce the time cost of planning computation. Simulation and the corresponding analysis show that the planning results of CE-MAGA perform better in terrain following, terrain avoidance, and threat avoidance (TF/TA2) and have lower route costs than those of other existing algorithms. In addition, feasible flight routes can be acquired within 2 s, and the whole evolutionary process converges very quickly.