58,061 articles found
Multi-Agent Reinforcement Learning for Moving Target Defense Temporal Decision-Making Approach Based on Stackelberg-FlipIt Games
1
Authors: Rongbo Sun, Jinlong Fei, Yuefei Zhu, Zhongyu Guo 《Computers, Materials & Continua》 2025, No. 8, pp. 3765-3786 (22 pages)
Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches (periodic or overly frequent MTD triggers) lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Keywords: Cyber security; moving target defense; multi-agent reinforcement learning; security metrics; game theory
An Overview of Distributed Nash Equilibrium Seeking in Noncooperative Games for Multi-agent Systems:A Dynamic Control-Based Perspective
2
Authors: Guanghui WEN, Xiao FANG, Meng LUAN 《Artificial Intelligence Science and Engineering》 2025, No. 4, pp. 239-254 (16 pages)
This paper presents a comprehensive overview of distributed Nash equilibrium (NE) seeking algorithms in non-cooperative games for multiagent systems (MASs), with a distinct emphasis on the dynamic control perspective. It specifically focuses on the research addressing distributed NE seeking problems in which agents are governed by heterogeneous dynamics. The paper begins by introducing fundamental concepts of general non-cooperative games and the NE, along with definitions of specific game structures such as aggregative games and multi-cluster games. It then systematically reviews existing studies on distributed NE seeking for various classes of MASs from the viewpoint of agent dynamics, including first-order, second-order, high-order, linear, and Euler-Lagrange (EL) systems. Furthermore, the paper highlights practical applications of these theoretical advances in cooperative control scenarios involving autonomous systems with complex dynamics, such as autonomous surface vessels, autonomous aerial vehicles, and other autonomous vehicles. Finally, the paper outlines several promising directions for future research.
Keywords: non-cooperative game; aggregative game; multi-cluster game; Nash equilibrium; dynamic control; autonomous system
A Dynamic Deceptive Defense Framework for Zero-Day Attacks in IIoT:Integrating Stackelberg Game and Multi-Agent Distributed Deep Deterministic Policy Gradient
3
Authors: Shigen Shen, Xiaojun Ji, Yimeng Liu 《Computers, Materials & Continua》 2025, No. 11, pp. 3997-4021 (25 pages)
The Industrial Internet of Things (IIoT) is increasingly vulnerable to sophisticated cyber threats, particularly zero-day attacks that exploit unknown vulnerabilities and evade traditional security measures. To address this critical challenge, this paper proposes a dynamic defense framework named Zero-day-aware Stackelberg Game-based Multi-Agent Distributed Deep Deterministic Policy Gradient (ZSG-MAD3PG). The framework integrates Stackelberg game modeling with the Multi-Agent Distributed Deep Deterministic Policy Gradient (MAD3PG) algorithm and incorporates defensive deception (DD) strategies to achieve adaptive and efficient protection. While conventional methods typically incur considerable resource overhead and exhibit higher latency due to static or rigid defensive mechanisms, the proposed ZSG-MAD3PG framework mitigates these limitations through multi-stage game modeling and adaptive learning, enabling more efficient resource utilization and faster response times. The Stackelberg-based architecture allows defenders to dynamically optimize packet sampling strategies, while attackers adjust their tactics to reach rapid equilibrium. Furthermore, dynamic deception techniques reduce the time required for the concealment of attacks and the overall system burden. A lightweight behavioral fingerprinting detection mechanism further enhances real-time zero-day attack identification within industrial device clusters. ZSG-MAD3PG demonstrates higher true positive rates (TPR) and lower false alarm rates (FAR) compared to existing methods, while also achieving improved latency, resource efficiency, and stealth adaptability in IIoT zero-day defense scenarios.
Keywords: Industrial internet of things; zero-day attacks; Stackelberg game; distributed deep deterministic policy gradient; defensive spoofing; dynamic defense
MANUFACTURING SYSTEM SCHEDULING BASED ON MULTI-AGENT COOPERATION GAME (Cited by: 1)
4
Authors: 刘建国, 张小锋, 王宁生 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI 2007, No. 4, pp. 329-334 (6 pages)
Aiming at the flexible manufacturing system with multi-machining and multi-assembly equipment, a new scheduling algorithm is proposed to decompose the assembly structure of the products, thus obtaining simple scheduling problems and forming the corresponding agents. Then, the importance and the restriction of each agent are considered, to obtain an order of simple scheduling problems based on the cooperation game theory. With this order, the scheduling of sub-questions is implemented in terms of rules, and near-optimal scheduling results that meet the restriction can be obtained. Experimental results verify the effectiveness of the proposed scheduling algorithm.
Keywords: manufacturing scheduling; cooperation game; agent
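The New Paradigm of "Big Data, Large Models, and Large-Scale Computing" and Precise Public Opinion Assessment: An Exploration along Two Dimensions, Theory and Multi-Agent Empirical Study (Cited by: 1)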
5
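Authors: 丁晓蔚, 戚庆燕, 刘梓航 《传媒观察》 2025, No. 2, pp. 28-42 (15 pages)
This paper explores the theory and empirical application of the new paradigm of "big data, large models, and large-scale computing" in precise public opinion assessment. The theoretical part discusses the concept of the paradigm and the relationships it involves, and analyzes its connection to multi-agent systems. The empirical part, based on application cases of this paradigm in public opinion assessment, proposes a public opinion analysis framework driven by multi-agent collaboration and constructs a new assessment workflow that can respond effectively to a dynamically changing public opinion environment. The multi-agent approach is used to predict and test whether trending events will reach the hot-search list, and is compared with conventional large models and a BERT model. The study shows that the multi-agent approach has a significant advantage for events that evoke public emotional resonance and have broad social reach, improving prediction accuracy and robustness through comprehensive multi-angle evaluation. The empirical study confirms the value of the multi-agent approach in public opinion monitoring and offers a new technical path for precise public opinion assessment.
Keywords: new paradigm of "big data, large models, and large-scale computing"; multi-agent systems; precise public opinion assessment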
Multi-agent system application in accordance with game theory in bi-directional coordination network model (Cited by: 3)
6
Authors: ZHANG Jie, WANG Gang, YUE Shaohua, SONG Yafei, LIU Jiayi, YAO Xiaoqiang 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2020, No. 2, pp. 279-289 (11 pages)
The multi-agent system is the optimal solution to complex intelligent problems. In accordance with game theory, the concept of loyalty is introduced to analyze the relationship between agents' individual income and global benefits and to build the logical architecture of the multi-agent system. Besides, to verify the feasibility of the method, the cyclic neural network is optimized, the bi-directional coordination network is built as the training network for deep learning, and specific training scenes are simulated as the training background. After a certain number of training iterations, the model can learn simple strategies autonomously. Also, as the training time increases, the complexity of learning strategies rises gradually. Strategies such as obstacle avoidance, firepower distribution and collaborative cover are adopted to demonstrate the achievability of the model. The model is verified to be realizable by the examples of obstacle avoidance, fire distribution and cooperative cover. Under the same resource background, the model exhibits better convergence than other deep learning training networks, and it does not easily fall into a local endless loop. Furthermore, the ability of the learning strategy is stronger than that of the training model based on rules, which is of great practical value.
Keywords: loyalty; game theory; bi-directional coordination network; multi-agent system; learning strategy
A single-task and multi-decision evolutionary game model based on multi-agent reinforcement learning (Cited by: 4)
7
Authors: MA Ye, CHANG Tianqing, FAN Wenhui 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2021, No. 3, pp. 642-657 (16 pages)
In the evolutionary game of the same task for groups, the changes in game rules, personal interests, the crowd size, and external supervision cause uncertain effects on individual decision-making and game results. In the Markov decision framework, a single-task multi-decision evolutionary game model based on multi-agent reinforcement learning is proposed to explore the evolutionary rules in the process of a game. The model can improve the result of an evolutionary game and facilitate the completion of the task. First, based on the multi-agent theory, to solve the existing problems in the original model, a negative feedback tax penalty mechanism is proposed to guide the strategy selection of individuals in the group. In addition, in order to evaluate the evolutionary game results of the group in the model, a calculation method of the group intelligence level is defined. Secondly, the Q-learning algorithm is used to improve the guiding effect of the negative feedback tax penalty mechanism. In the model, the selection strategy of the Q-learning algorithm is improved and a bounded rationality evolutionary game strategy is proposed based on the rule of evolutionary games and the consideration of the bounded rationality of individuals. Finally, simulation results show that the proposed model can effectively guide individuals to choose cooperation strategies which are beneficial to task completion and stability under different negative feedback factor values and different group sizes, so as to improve the group intelligence level.
Keywords: multi-agent reinforcement learning; evolutionary game; Q-learning
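The entry above builds on the Q-learning algorithm. As a rough illustration only, the textbook tabular update can be sketched as follows. This is the standard rule, not the improved selection strategy or the tax penalty mechanism described in the abstract; the state names, action labels, `alpha`, and `gamma` are assumptions chosen for the example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action setting (labels are illustrative only).
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ["cooperate", "defect"]
new_value = q_update(
    q_table, "s0", "cooperate", reward=1.0, next_state="s1", actions=actions
)
```

With an all-zero table, the first update moves the entry a fraction `alpha` of the way toward the observed reward.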
A Novel Distributed Optimal Adaptive Control Algorithm for Nonlinear Multi-Agent Differential Graphical Games (Cited by: 7)
8
Authors: Majid Mazouchi, Mohammad Bagher Naghibi-Sistani, Seyed Kamal Hosseini Sani 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2018, No. 1, pp. 331-341 (11 pages)
In this paper, an online optimal distributed learning algorithm is proposed to solve the leader-synchronization problem of nonlinear multi-agent differential graphical games. Each player approximates its optimal control policy using a single-network approximate dynamic programming (ADP) where only one critic neural network (NN) is employed instead of the typical actor-critic structure composed of two NNs. The proposed distributed weight tuning laws for critic NNs guarantee stability in the sense of uniform ultimate boundedness (UUB) and convergence of control policies to the Nash equilibrium. In this paper, by introducing novel distributed local operators in weight tuning laws, there is no more requirement for initial stabilizing control policies. Furthermore, the overall closed-loop system stability is guaranteed by Lyapunov stability analysis. Finally, simulation results show the effectiveness of the proposed algorithm.
Keywords: Approximate dynamic programming (ADP); distributed control; neural networks (NNs); nonlinear differential graphical games; optimal control
Cooperative and Competitive Multi-Agent Systems: From Optimization to Games (Cited by: 16)
9
Authors: Jianrui Wang, Yitian Hong, Jiali Wang, Jiapeng Xu, Yang Tang, Qing-Long Han, Jürgen Kurths 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2022, No. 5, pp. 763-783 (21 pages)
Multi-agent systems can solve scientific issues related to complex systems that are difficult or impossible for a single agent to solve through mutual collaboration and cooperation optimization. In a multi-agent system, agents with a certain degree of autonomy generate complex interactions due to the correlation and coordination, which is manifested as cooperative/competitive behavior. This survey focuses on multi-agent cooperative optimization and cooperative/non-cooperative games. Starting from cooperative optimization, the studies on distributed optimization and federated optimization are summarized. The survey mainly focuses on distributed online optimization and its application in privacy protection, and overviews federated optimization from the perspective of privacy protection mechanisms. Then, cooperative games and non-cooperative games are introduced to expand the cooperative optimization problems from two aspects of minimizing global costs and minimizing individual costs, respectively. Multi-agent cooperative and non-cooperative behaviors are modeled by games from both static and dynamic aspects, according to whether each player can make decisions based on the information of other players. Finally, future directions for cooperative optimization, cooperative/non-cooperative games, and their applications are discussed.
Keywords: Cooperative games; counterfactual regret minimization; distributed optimization; federated optimization; fictitious
Research on Maneuver Decision-Making of Multi-Agent Adversarial Game in a Random Interference Environment (Cited by: 1)
10
Authors: Shiguang Hu, Le Ru, Bo Lu, Zhenhua Wang, Xiaolin Zhao, Wenfei Wang, Hailong Xi 《Computers, Materials & Continua》 SCIE EI 2024, No. 10, pp. 1879-1903 (25 pages)
The strategy evolution process of game players is highly uncertain due to random emergent situations and other external disturbances. This paper investigates the issue of strategy interaction and behavioral decision-making among game players in simulated confrontation scenarios within a random interference environment. It considers the possible risks that random disturbances may pose to the autonomous decision-making of game players, as well as the impact of participants' manipulative behaviors on the state changes of the players. A nonlinear mathematical model is established to describe the strategy decision-making process of the participants in this scenario. Subsequently, the strategy selection interaction relationship, strategy evolution stability, and dynamic decision-making process of the game players are investigated and verified by simulation experiments. The results show that maneuver-related parameters and random environmental interference factors have different effects on the selection and evolutionary speed of the agent's strategies. Especially in a highly uncertain environment, even small information asymmetry or miscalculation may have a significant impact on decision-making. This also confirms the feasibility and effectiveness of the method proposed in the paper, which can better explain the behavioral decision-making process of the agent in the interaction process. This study provides feasibility analysis ideas and theoretical references for improving multi-agent interactive decision-making and the interpretability of the game system model.
Keywords: Behavior decision-making; stochastic evolutionary game; nonlinear mathematical modeling; multi-agent; maneuver
A Survey of Cooperative Multi-agent Reinforcement Learning for Multi-task Scenarios (Cited by: 1)
11
Authors: Jiajun CHAI, Zijie ZHAO, Yuanheng ZHU, Dongbin ZHAO 《Artificial Intelligence Science and Engineering》 2025, No. 2, pp. 98-121 (24 pages)
Cooperative multi-agent reinforcement learning (MARL) is a key technology for enabling cooperation in complex multi-agent systems. It has achieved remarkable progress in areas such as gaming, autonomous driving, and multi-robot control. Empowering cooperative MARL with multi-task decision-making capabilities is expected to further broaden its application scope. In multi-task scenarios, cooperative MARL algorithms need to address 3 types of multi-task problems: reward-related multi-task, arising from different reward functions; multi-domain multi-task, caused by differences in state and action spaces and state transition functions; and scalability-related multi-task, resulting from the dynamic variation in the number of agents. Most existing studies focus on scalability-related multi-task problems. However, with the increasing integration between large language models (LLMs) and multi-agent systems, a growing number of LLM-based multi-agent systems have emerged, enabling more complex multi-task cooperation. This paper provides a comprehensive review of the latest advances in this field. By combining multi-task reinforcement learning with cooperative MARL, we categorize and analyze the 3 major types of multi-task problems under multi-agent settings, offering more fine-grained classifications and summarizing key insights for each. In addition, we summarize commonly used benchmarks and discuss future directions of research in this area, which hold promise for further enhancing the multi-task cooperation capabilities of multi-agent systems and expanding their practical applications in the real world.
Keywords: multi-task; multi-agent reinforcement learning; large language models
Cooperative multi-agent game based on reinforcement learning (Cited by: 2)
12
Authors: Hongbo Liu 《High-Confidence Computing》 EI 2024, No. 1, pp. 70-80 (11 pages)
Multi-agent reinforcement learning holds tremendous potential for revolutionizing intelligent systems across diverse domains. However, it is also concomitant with a set of formidable challenges, which include the effective allocation of credit values to each agent, real-time collaboration among heterogeneous agents, and an appropriate reward function to guide agent behavior. To handle these issues, we propose an innovative solution named the Graph Attention Counterfactual Multiagent Actor-Critic algorithm (GACMAC). This algorithm encompasses several key components: First, it employs a multiagent actor-critic framework along with counterfactual baselines to assess the individual actions of each agent. Second, it integrates a graph attention network to enhance real-time collaboration among agents, enabling heterogeneous agents to effectively share information during handling tasks. Third, it incorporates prior human knowledge through a potential-based reward shaping method, thereby elevating the convergence speed and stability of the algorithm. We tested our algorithm on the StarCraft Multi-Agent Challenge (SMAC) platform, which is a recognized platform for testing multiagent algorithms, and our algorithm achieved a win rate of over 95% on the platform, comparable to the current state-of-the-art multi-agent controllers.
Keywords: Collaborative multi-agent; Reinforcement learning; Credit distribution; Multi-agent communication; Reward shaping
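The entry above mentions potential-based reward shaping. A minimal sketch of the general technique, assuming the usual form in which the shaping term is gamma * Phi(s') - Phi(s), is given below. The 1-D state space, the distance-based potential, and the function names are hypothetical and are not taken from the GACMAC paper.

```python
def potential(state, goal):
    """Hypothetical potential: negative distance to the goal on a 1-D line."""
    return -abs(goal - state)


def shaped_reward(reward, state, next_state, goal, gamma=1.0):
    """Environment reward plus the shaping term gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(next_state, goal) - potential(state, goal)


# An agent walking from position 0 to the goal at position 3,
# with zero environment reward along the way.
trajectory = [0, 1, 2, 3]
goal = 3
bonuses = [
    shaped_reward(0.0, s, s_next, goal)
    for s, s_next in zip(trajectory, trajectory[1:])
]
total_bonus = sum(bonuses)
```

With `gamma=1.0` the shaping terms telescope, so the total bonus along a trajectory depends only on the potentials of its first and last states. That is the intuition behind why this kind of shaping steers learning without changing which policies are optimal.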
Improved Event-Triggered Adaptive Neural Network Control for Multi-agent Systems Under Denial-of-Service Attacks (Cited by: 1)
13
Authors: Huiyan ZHANG, Yu HUANG, Ning ZHAO, Peng SHI 《Artificial Intelligence Science and Engineering》 2025, No. 2, pp. 122-133 (12 pages)
This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, by combining a memory-based adaptive event-triggered mechanism with neural networks, the paper aims to approximate the nonlinear terms in the networked system and efficiently conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Keywords: multi-agent systems; neural network; DoS attacks; memory-based adaptive event-triggered mechanism
Graph-based multi-agent reinforcement learning for collaborative search and tracking of multiple UAVs (Cited by: 2)
14
Authors: Bocheng ZHAO, Mingying HUO, Zheng LI, Wenyu FENG, Ze YU, Naiming QI, Shaohai WANG 《Chinese Journal of Aeronautics》 2025, No. 3, pp. 109-123 (15 pages)
This paper investigates the challenges associated with Unmanned Aerial Vehicle (UAV) collaborative search and target tracking in dynamic and unknown environments characterized by limited field of view. The primary objective is to explore the unknown environments to locate and track targets effectively. To address this problem, we propose a novel Multi-Agent Reinforcement Learning (MARL) method based on Graph Neural Network (GNN). Firstly, a method is introduced for encoding continuous-space multi-UAV problem data into spatial graphs which establish essential relationships among agents, obstacles, and targets. Secondly, a Graph AttenTion network (GAT) model is presented, which focuses exclusively on adjacent nodes, learns attention weights adaptively and allows agents to better process information in dynamic environments. Reward functions are specifically designed to tackle exploration challenges in environments with sparse rewards. By introducing a framework that integrates centralized training and distributed execution, the advancement of models is facilitated. Simulation results show that the proposed method outperforms the existing MARL method in search rate and tracking performance with fewer collisions. The experiments show that the proposed method can be extended to applications with a larger number of agents, which provides a potential solution to the challenging problem of multi-UAV autonomous tracking in dynamic unknown environments.
Keywords: Unmanned aerial vehicle (UAV); multi-agent reinforcement learning (MARL); Graph attention network (GAT); Tracking; Dynamic and unknown environment
Road Pricing Design Based on Game Theory and Multi-agent Consensus (Cited by: 2)
15
Authors: Nan Xiao, Xuehe Wang, Lihua Xie, Tichakorn Wongpiromsarn, Emilio Frazzoli, Daniela Rus 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI 2014, No. 1, pp. 31-39 (9 pages)
Consensus theory and noncooperative game theory respectively deal with cooperative and noncooperative interactions among multiple players/agents. They provide a natural framework for road pricing design, since each motorist may myopically optimize his or her own utility as a function of road price and collectively communicate with his or her friends and neighbors on the traffic situation at the same time. This paper considers the road pricing design by using game theory and consensus theory. For the case where a system supervisor broadcasts information on the overall system to each agent, we present a variant of standard fictitious play called average strategy fictitious play (ASFP) for large-scale repeated congestion games. Only a weighted running average of all other players' actions is assumed to be available to each player. The ASFP reduces the burden of both information gathering and information processing for each player. Compared to the joint strategy fictitious play (JSFP) studied in the literature, the updating process of utility functions for each player is avoided. We prove that there exists at least one pure strategy Nash equilibrium for the congestion game under investigation, and the players' actions generated by the ASFP with inertia (players' reluctance to change their previous actions) converge to a Nash equilibrium almost surely. For the case without broadcasting, a consensus protocol is introduced for individual agents to estimate the percentage of players choosing each resource, and the convergence property of the players' action profile is still ensured. The results are applied to road pricing design to achieve socially local optimal trip timing. Simulation results are provided based on the real traffic data for the Singapore case study.
Keywords: average strategy fictitious play (ASFP); game theory; multi-agent consensus; road pricing
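The entry above uses a consensus protocol so that agents can estimate, without a broadcast, the share of players choosing a resource. The paper's actual protocol is not given in the abstract; the following is only a generic average-consensus sketch, assuming an undirected ring of four agents and a fixed step weight. The graph, the weight, and all names are illustrative.

```python
def consensus_step(values, neighbors, weight):
    """One synchronous average-consensus step: each agent moves toward
    its neighbors' current values by `weight` times the disagreement."""
    return [
        x + weight * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


# Hypothetical ring of four agents. Agent 0 chose the resource (1.0),
# the others did not (0.0), so the true share is 0.25.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimates = [1.0, 0.0, 0.0, 0.0]
true_share = sum(estimates) / len(estimates)

for _ in range(50):
    estimates = consensus_step(estimates, neighbors, weight=0.25)
```

Because the graph is undirected and every agent uses the same weight, the sum of the estimates is preserved at each step, so all local estimates approach the true share using only neighbor-to-neighbor exchanges. The weight is kept below one over the maximum degree, a common sufficient condition for convergence.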
Multi-agent graphical games with input constraints: an online learning solution (Cited by: 3)
16
Authors: Tianxiang WANG, Bingchang WANG, Yong LIANG 《Control Theory and Technology》 EI CSCD 2020, No. 2, pp. 148-159 (12 pages)
This paper studies an online iterative algorithm for solving discrete-time multi-agent dynamic graphical games with input constraints. In order to obtain the optimal strategy of each agent, it is necessary to solve a set of coupled Hamilton-Jacobi-Bellman (HJB) equations. It is very difficult to solve HJB equations by the traditional method. The relevant game problem will become more complex if the control input of each agent in the dynamic graphical game is constrained. In this paper, an online iterative algorithm is proposed to find the online solution to the dynamic graphical game without the need for drift dynamics of agents. Actually, this algorithm is to find the optimal solution of Bellman equations online. This solution employs a distributed policy iteration process, using only the local information available to each agent. It can be proved that under certain conditions, when each agent updates its own strategy simultaneously, the whole multi-agent system will reach Nash equilibrium. In the process of algorithm implementation, for each agent, two layers of neural networks are used to fit the value function and control strategy, respectively. Finally, a simulation example is given to show the effectiveness of our method.
Keywords: Actor-critic algorithm; differential games; input constraints; neural network (NN); reinforcement learning (RL)
Dynamic Decoupling-Driven Cooperative Pursuit for Multi-UAV Systems:A Multi-Agent Reinforcement Learning Policy Optimization Approach
17
Authors: Lei Lei, Chengfu Wu, Huaimin Chen 《Computers, Materials & Continua》 2025, No. 10, pp. 1339-1363 (25 pages)
This paper proposes a Multi-Agent Attention Proximal Policy Optimization (MA2PPO) algorithm aiming at problems such as credit assignment, low collaboration efficiency and weak strategy generalization ability in the cooperative pursuit tasks of multiple unmanned aerial vehicles (UAVs). Traditional algorithms often fail to effectively identify critical cooperative relationships in such tasks, leading to low capture efficiency and a significant decline in performance when the scale expands. To tackle these issues, based on the proximal policy optimization (PPO) algorithm, MA2PPO adopts the centralized training with decentralized execution (CTDE) framework and introduces a dynamic decoupling mechanism, that is, sharing the multi-head attention (MHA) mechanism for critics during centralized training to solve the credit assignment problem. This method enables the pursuers to identify highly correlated interactions with their teammates, effectively eliminate irrelevant and weakly relevant interactions, and decompose large-scale cooperation problems into decoupled sub-problems, thereby enhancing the collaborative efficiency and policy stability among multiple agents. Furthermore, a reward function has been devised to facilitate the pursuers to encircle the escapee by combining a formation reward with a distance reward, which incentivizes UAVs to develop sophisticated cooperative pursuit strategies. Experimental results demonstrate the effectiveness of the proposed algorithm in achieving multi-UAV cooperative pursuit and inducing diverse cooperative pursuit behaviors among UAVs. Moreover, experiments on scalability have demonstrated that the algorithm is suitable for large-scale multi-UAV systems.
Keywords: multi-agent reinforcement learning; multi-UAV systems; pursuit-evasion games
A Research Platform of Multi-agent System Robot Soccer Game (Cited by: 1)
18
Authors: 薄喜柱, Hong Bingrong 《High Technology Letters》 EI CAS 2000, No. 4, pp. 20-24 (5 pages)
A soccer robot system (HIT 1) was built to participate in MIROSOT_China99 held in Harbin Institute of Technology. Robot soccer game is a very complex robot application that incorporates real time vision system, robot control, wireless communication and control of multiple robots. In the paper, we present the design and the hardware architecture and software architecture of our distributed multiple robot system.
Keywords: Multi agent system; MIROSOT; Soccer Robot; Robot soccer game
Game-theoretic multi-agent motion planning in a mixed environment
19
Authors: Xiaoxue Zhang, Lihua Xie 《Control Theory and Technology》 EI CSCD 2024, No. 3, pp. 379-393 (15 pages)
The motion planning problem for multi-agent systems becomes particularly challenging when humans or human-controlled robots are present in a mixed environment. To address this challenge, this paper presents an interaction-aware motion planning approach based on game theory in a receding-horizon manner. Leveraging the framework provided by dynamic potential games for handling the interactions among agents, this approach formulates the multi-agent motion planning problem as a differential potential game, highlighting the effectiveness of constrained potential games in facilitating interactive motion planning among agents. Furthermore, online learning techniques are incorporated to dynamically learn the unknown preferences and models of humans or human-controlled robots through the analysis of observed data. To evaluate the effectiveness of the proposed approach, numerical simulations are conducted, demonstrating its capability to generate interactive trajectories for all agents, including humans and human-controlled agents, operating within the mixed environment. The simulation results illustrate the effectiveness of the proposed approach in handling the complexities of multi-agent motion planning in real-world scenarios.
Keywords: Motion planning; Differential potential game; Multi-agent systems; Constrained potential game
Research on decision-making behavior of multi-agent alliance in cross-border electricity market environment: an evolutionary game
20
Authors: Zhao Luo, Chenming Dong, Xinrui Dai, Hua Wang, Guihong Bi, Xin Shen 《Global Energy Interconnection》 EI CSCD 2024, No. 6, pp. 707-722 (16 pages)
Constructing a cross-border power energy system with multiagent power energy as an alliance is important for studying cross-border power-trading markets. This study considers multiple neighboring countries in the form of alliances, introduces neighboring countries' exchange rates into the cross-border multi-agent power-trading market and proposes a method to study each agent's dynamic decision-making behavior based on evolutionary game theory. To this end, this study uses three national agents as examples, constructs a tripartite evolutionary game model, and analyzes the evolution process of the decision-making behavior of each agent member state under the initial willingness value, cost of payment, and additional revenue of the alliance. This research helps realize cross-border energy operations so that the transaction agent can achieve greater trade profits and provides a theoretical basis for cooperation and stability between multiple agents.
Keywords: Multi-agent alliance; Cross-border transactions; Electricity market; Evolutionary game; Decision-making