11,658 articles found
Novel multi-agent action masked deep reinforcement learning for general industrial assembly lines balancing problems
1
Authors: Ali M. Ali, Luca Tirel, Hashim A. Hashim. Journal of Automation and Intelligence, 2025, No. 4, pp. 299-311 (13 pages)
Efficient planning of activities is essential for modern industrial assembly lines to uphold manufacturing standards, prevent project constraint violations, and achieve cost-effective operations. While exact solutions to such challenges can be obtained through Integer Programming (IP), the dependence of the search space on input parameters often makes IP computationally infeasible for large-scale scenarios. Heuristic methods, such as Genetic Algorithms, can also be applied, but they frequently produce suboptimal solutions in extensive cases. This paper introduces a novel mathematical model of a generic industrial assembly line formulated as a Markov Decision Process (MDP), without imposing assumptions on the type of assembly line, a notable distinction from most existing models. The proposed model is employed to create a virtual environment for training Deep Reinforcement Learning (DRL) agents to optimize task and resource scheduling. To enhance the efficiency of agent training, the paper proposes two innovative tools. The first is an action-masking technique, which ensures the agent selects only feasible actions, thereby reducing training time. The second is a multi-agent approach, where each workstation is managed by an individual agent; as a result, the state and action spaces are reduced. A centralized training framework with decentralized execution is adopted, offering a scalable learning architecture for optimizing industrial assembly lines. This framework allows the agents to learn offline and subsequently provide real-time solutions during operations by leveraging a neural network that maps the current factory state to the optimal action. The effectiveness of the proposed scheme is validated through numerical simulations, demonstrating significantly faster convergence to the optimal solution compared to a comparable model-based approach.
Keywords: Artificial intelligence in industrial engineering; Autonomous decision making; Distributed multi-agent learning; Reinforcement learning
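The action-masking idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_policy`, the example scores, and the feasibility mask are hypothetical, and the sketch assumes a discrete action set with a softmax policy, which the abstract does not specify.

```python
import numpy as np

def masked_policy(logits, feasible):
    """Convert raw action scores into probabilities over feasible actions only.

    logits   -- one score per action (hypothetical network output)
    feasible -- boolean mask, True where the action is currently allowed
    """
    logits = np.asarray(logits, dtype=float)
    feasible = np.asarray(feasible, dtype=bool)
    if not feasible.any():
        raise ValueError("at least one action must be feasible")
    # Infeasible actions get a score of -inf, so exp() maps them to exactly 0.
    masked = np.where(feasible, logits, -np.inf)
    # Subtract the largest feasible score for numerical stability.
    weights = np.exp(masked - masked.max())
    return weights / weights.sum()

# Four hypothetical actions, of which only the first and third are feasible.
probs = masked_policy([2.0, 0.5, 1.0, -1.0], [True, False, True, False])
```

Because infeasible actions receive zero probability, the agent never samples them, so no training steps are spent exploring actions that the constraints would reject anyway.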
MAS (Multi-Agent System)-Based Multi-Robot Systems: An Important Direction in the Development of Cooperative Multi-Robotics (cited by: 20)
2
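Authors: 陈忠泽 (Chen Zhongze), 林良明 (Lin Liangming), 颜国正 (Yan Guozheng). 《机器人》 (Robot), EI, CSCD, PKU Core, 2001, No. 4, pp. 368-373 (6 pages)
The way robots are applied is shifting from component-level, single-unit use toward system-level use. This is both a practical necessity and an inevitable trend in technological development; related technologies, such as computer networking, also provide the support needed to realize it. Theoretical problems of multi-robot cooperation have consequently become a hot topic in robotics research, and the theory of multi-agent systems (MAS) within distributed artificial intelligence (DAI) has attracted the attention of researchers working on multi-robot cooperation theory. By revealing the intrinsic connection between cooperative multi-robot systems and MAS, this paper points out that MAS-based cooperative multi-robot systems are an important direction in the development of cooperative multi-robotics.
Keywords: multi-robot systems; multi-agent systems; cooperative multi-robotics; MAS; artificial intelligence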
MAS-Based Decision-Making Mechanism and Simulation of Farmers' Non-Grain Use of Terraced Fields
3
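Authors: 后莉 (Hou Li), 裴婷婷 (Pei Tingting), 陈英 (Chen Ying), 谢保鹏 (Xie Baopeng), 席瑞云 (Xi Ruiyun). 《农业资源与环境学报》 (Journal of Agricultural Resources and Environment), PKU Core, 2026, No. 1, pp. 104-117 (14 pages)
To explore the logic behind farmers' non-grain use of terraced fields, this study selected three typical study areas: a fruit-grain mixed type (Area 1), a grain-cropping and abandonment mixed type (Area 2), and an apple-dominated type (Area 3). Based on a multi-agent system (MAS), and combining field surveys with multi-scenario simulation, it examined the behavioral mechanisms behind farmers' terrace-use decisions in the Longzhong and Longdong regions of Gansu. The results show that the internal mechanism of farmers' terrace-use decisions is a process in which maximizing economic benefit is the main goal, household resource endowments exert the stronger constraining and guiding role, and the external natural, social, and policy environments provide additional incentives or constraints. The combined weight ratios of internal to external environmental variables in the three study areas were 0.486:0.514, 0.575:0.425, and 0.538:0.462, respectively. Farmers' terrace-use decisions in the Longzhong and Longdong regions show a trend dominated by non-grain use, with grain production secondary; the final decision values for non-grain use in the three study areas were 0.852, 0.842, and 0.942, respectively. Farmers' perception, feedback values, decision values, and main environmental variables for each type of terrace use are spatially heterogeneous across the study areas. In the multi-scenario simulation, grain-production incentive policies had a significant positive effect on farmers' decision values for grain crops, saturation of the non-grain market effectively curbed cash-crop non-grain use, and attracting labor back effectively alleviated land abandonment. Finally, targeted strategies for optimizing the pattern of agricultural non-grain use are proposed, providing a reference for promoting sustainable rural development.
Keywords: non-grain use of terraced fields; farmers; behavioral decision-making; multi-agent system (MAS); Longzhong and Longdong regions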
Leader-following positive consensus of heterogeneous switched multi-agent systems with average dwell time switching
4
Authors: Kaiming Li, Wei Xing, Haoyue Yang, Junfeng Zhang. Control Theory and Technology, 2026, No. 1, pp. 66-81 (16 pages)
This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then sufficient conditions are derived to guarantee the positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
Keywords: Heterogeneous switched multi-agent systems; Positive consensus; Linear programming
Research on UAV-MEC Cooperative Scheduling Algorithms Based on Multi-Agent Deep Reinforcement Learning
5
Authors: Yonghua Huo, Ying Liu, Anni Jiang, Yang Yang. Computers, Materials & Continua, 2026, No. 3, pp. 1823-1850 (28 pages)
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Keywords: UAV-MEC networks; multi-agent deep reinforcement learning; MATD3; task offloading
Finite-time fault-tolerant tracking control for multi-agent systems based on neural observer
6
Authors: Junzhe Cheng, Shitong Zhang, Qing Wang, Bin Xin. Control Theory and Technology, 2026, No. 1, pp. 10-23 (14 pages)
This paper investigates the consensus tracking control problem for high-order nonlinear multi-agent systems subject to non-affine faults, partially measurable states, uncertain control coefficients, and unknown external disturbances. Under directed topology conditions, an observer-based finite-time control strategy based on adaptive backstepping is proposed, in which a neural network-based state observer is employed to approximate the unmeasurable system state variables. To address the complexity explosion problem associated with the backstepping method, a finite-time command filter is incorporated, with error compensation signals designed to mitigate the filter-induced errors. Additionally, a Butterworth low-pass filter is introduced to avoid the algebraic loop problem in the design of the controller. The finite-time stability of the closed-loop system is rigorously analyzed with the finite-time Lyapunov stability criterion, validating that all closed-loop signals of the system remain bounded within a finite time. Finally, the effectiveness of the proposed control strategy is verified through a simulation example.
Keywords: multi-agent systems; Command filtered backstepping; Finite-time control; Neural observer; Non-affine faults
MultiAgent-CoT: A Multi-Agent Chain-of-Thought Reasoning Model for Robust Multimodal Dialogue Understanding
7
Author: Ans D. Alghamdi. Computers, Materials & Continua, 2026, No. 2, pp. 1395-1429 (35 pages)
Multimodal dialogue systems often fail to maintain coherent reasoning over extended conversations and suffer from hallucination due to limited context modeling capabilities. Current approaches struggle with cross-modal alignment, temporal consistency, and robust handling of noisy or incomplete inputs across multiple modalities. We propose MultiAgent-CoT, a novel multi-agent chain-of-thought (CoT) reasoning framework in which specialized agents for text, vision, and speech modalities collaboratively construct shared reasoning traces through inter-agent message passing and consensus voting mechanisms. Our architecture incorporates self-reflection modules, conflict resolution protocols, and dynamic rationale alignment to enhance consistency, factual accuracy, and user engagement. The framework employs a hierarchical attention mechanism with cross-modal fusion and implements adaptive reasoning depth based on dialogue complexity. Comprehensive evaluations on Situated Interactive Multi-Modal Conversations (SIMMC) 2.0, VisDial v1.0, and newly introduced challenging scenarios demonstrate statistically significant improvements in grounding accuracy (p<0.01), chain-of-thought interpretability, and robustness to adversarial inputs compared to state-of-the-art monolithic transformer baselines and existing multi-agent approaches.
Keywords: multi-agent systems; chain-of-thought reasoning; multimodal dialogue; conversational artificial intelligence (AI); cross-modal fusion; reasoning interpretability
Distributed optimal formation control of heterogeneous Euler–Lagrange multi-agent systems
8
Authors: Mengmeng Duan, Fengping Huang, Shanying Zhu, Ziwen Yang, Cailian Chen. Journal of Automation and Intelligence, 2025, No. 4, pp. 282-290 (9 pages)
In this paper, the distributed optimal formation control problem of heterogeneous Euler–Lagrange multi-agent systems with generic formation constraints and inequality constraints is investigated. Based on the primal–dual dynamics and the adaptive control technique, a distributed optimal formation controller consisting of a velocity reference signal generator and a velocity tracking controller is proposed. By using the optimality condition, the relationship between the equilibrium point of the closed-loop system and the optimal solution of the optimization problem is established. Then, by utilizing Lyapunov stability analysis, it is rigorously proved that the optimal formation is reached with the proposed controller. Lastly, simulation examples are provided to substantiate the theoretical results.
Keywords: Formation control; Distributed optimization; multi-agent systems
Group formation tracking for heterogeneous linear multi-agent systems under switching topologies
9
Authors: Shiyu Zhou, Dong Sun. Journal of Automation and Intelligence, 2025, No. 2, pp. 108-114 (7 pages)
This article investigates the time-varying output group formation tracking control (GFTC) problem for heterogeneous multi-agent systems (HMASs) under switching topologies. The objective is to design a distributed control strategy that enables the outputs of the followers to form the desired sub-formations and track the outputs of the leader in each subgroup. Firstly, novel distributed observers are developed to estimate the states of the leaders under switching topologies. Then, GFTC protocols are designed based on the proposed observers. It is shown that with the distributed protocol, the GFTC problem for HMASs under switching topologies is solved if the average dwell time associated with the switching topologies is larger than a fixed threshold. Finally, an example is provided to illustrate the effectiveness of the proposed control strategy.
Keywords: Formation tracking; Group division; Switching topologies; multi-agent systems
Recent Advancement in Formation Control of Multi-Agent Systems: A Review
10
Authors: Aamir Farooq, Zhengrong Xiang, Wen-Jer Chang, Muhammad Shamrooz Aslam. Computers, Materials & Continua, 2025, No. 6, pp. 3623-3674 (52 pages)
Formation control in multi-agent systems has become a critical area of interest due to its wide-ranging applications in robotics, autonomous transportation, and surveillance. While various studies have explored distributed cooperative control, this review focuses on the theoretical foundations and recent developments in formation control strategies. The paper categorizes and analyzes key formation types, including formation maintenance, group or cluster formation, bipartite formations, event-triggered formations, finite-time convergence, and constrained formations. A significant portion of the review addresses formation control under constrained dynamics, presenting both model-based and model-free approaches that consider practical limitations such as actuator bounds, communication delays, and nonholonomic constraints. Additionally, the paper discusses emerging trends, including the integration of event-driven mechanisms and AI-enhanced coordination strategies. Comparative evaluations highlight the trade-offs among various methodologies regarding scalability, robustness, and real-world feasibility. Practical implementations are reviewed across diverse platforms, and the review identifies the current achievements and unresolved challenges in the field. The paper concludes by outlining promising research directions, such as adaptive control for dynamic environments, energy-efficient coordination, and learning-based control under uncertainty. This review synthesizes the current state of the art and provides a road map for future investigation, making it a valuable reference for researchers and practitioners aiming to advance formation control in multi-agent systems.
Keywords: Cooperative control; multi-agent systems; formation control; formation containment; group formation; bipartite formation
Unified Output Feedback Based Prescribed Performance Consensus Tracking Control of Heterogeneous Multi-Agent Systems
11
Authors: Dahui Luo, Yujuan Wang, Frank L. Lewis, Yongduan Song. IEEE/CAA Journal of Automatica Sinica, 2025, No. 8, pp. 1636-1647 (12 pages)
This paper proposes an output-feedback based prescribed performance consensus tracking control methodology for a class of heterogeneous multi-agent systems (HMASs) with inconsistent system structure, where the performance behavior of each agent is allowed to differ from that of the others. Both the heterogeneous system structures and the nonidentical performance requirements make the control problem much more challenging than that of MASs with identical structure and performance requirement. This is mainly due to the coupling effect of the system dynamics and performance restriction of each agent in the cooperative control action. The key to solving this problem is to introduce a dual-phase performance-guaranteed method, in which the consensus tracking error is decomposed into an auxiliary tracking error and a filter tracking error, and the whole performance control is then decomposed into two phases. By confining the two errors respectively, the practical tracking error can be proved to be explicitly confined within an arbitrarily given performance envelope by merely adjusting the design parameters rather than modifying the control structure. Moreover, the prescribed performance control (PPC) result is not only uniform with respect to any initial conditions and design parameters, allowing it to be global, but also unifies both the global and semi-global results into one frame, distinguishing itself from most existing PPC works where either only a global or only a semi-global result is guaranteed. Finally, the effectiveness of the proposed control scheme is confirmed by a simulation conducted on a group of tunnel-diode circuits (TDC).
Keywords: Auxiliary filter; heterogeneous multi-agent systems (HMASs); output feedback; prescribed performance
Defending Against Jamming and Interference for Internet of UAVs Using Cooperative Multi-Agent Reinforcement Learning with Mutual Information
12
Authors: Lin Yan, Wu Zhijuan, Peng Nuoheng, Zhao Tianyu, Zhang Yijin, Shu Feng, Li Jun. China Communications, 2025, No. 5, pp. 220-237 (18 pages)
The Internet of Unmanned Aerial Vehicles (I-UAVs) is expected to execute latency-sensitive tasks, but is limited by co-channel interference and malicious jamming. In the face of unknown prior environmental knowledge, defending against jamming and interference through spectrum allocation becomes challenging, especially when each UAV pair makes decisions independently. In this paper, we propose a cooperative multi-agent reinforcement learning (MARL)-based anti-jamming framework for I-UAVs, enabling UAV pairs to learn their own policies cooperatively. Specifically, we first model the problem as a model-free multi-agent Markov decision process (MAMDP) to maximize the long-term expected system throughput. Then, to improve the exploration of the optimal policy, we resort to optimizing a MARL objective function with a mutual-information (MI) regularizer between states and actions, which can dynamically assign the probability for actions frequently used by the optimal policy. Next, by sharing their current channel selections and local learning experience (their soft Q-values), the UAV pairs can learn their own policies cooperatively, relying only on previously observed information and predictions of others' actions. Our simulation results show that for both sweep jamming and Markov jamming patterns, the proposed scheme outperforms the benchmarks in terms of throughput, convergence, and stability for different numbers of jammers, channels, and UAV pairs.
Keywords: anti-jamming communication; internet of UAVs; multi-agent reinforcement learning; spectrum allocation
Observer-based prescribed-time time-varying output formation-containment control of heterogeneous multi-agent systems
13
Authors: Haiyang Hu, Tao Li, Xiaowen Zhao, Yuanmei Wang, Jialong Tian, Zijie Jiang. Chinese Physics B, 2025, No. 10, pp. 366-375 (10 pages)
This paper investigates the observer-based prescribed-time time-varying output formation-containment (PT-TV-OFC) control problem for heterogeneous multi-agent systems in which the different agents have different state dimensions. The system comprises one tracking leader, multiple formation leaders, and followers, where the two types of leaders are used to generate a reference trajectory for movement and to achieve a specific formation, respectively. Firstly, a prescribed-time dynamics observer is constructed for the formation leaders to estimate the tracking leader's dynamic model and state. On this basis, a prescribed-time control protocol is designed for the formation leaders to achieve time-varying output formation. Then, a prescribed-time convex hull observer is designed for the followers to estimate information regarding the convex hull formed by the formation leaders. Using the estimated convex hull information, a prescribed-time containment control protocol is designed to ensure the followers converge into the convex hull. Furthermore, using Lyapunov stability theory, the stability of the systems is proved in detail, which implies that the heterogeneous multi-agent systems can achieve PT-TV-OFC control. Finally, numerical simulations validate the feasibility of the theoretical results.
Keywords: heterogeneous multi-agent systems; prescribed-time control; observers; time-varying output formation-containment control
Multi-Agent Reinforcement Learning for Moving Target Defense Temporal Decision-Making Approach Based on Stackelberg-FlipIt Games
14
Authors: Rongbo Sun, Jinlong Fei, Yuefei Zhu, Zhongyu Guo. Computers, Materials & Continua, 2025, No. 8, pp. 3765-3786 (22 pages)
Moving Target Defense (MTD) necessitates scientifically effective decision-making methodologies for defensive technology implementation. While most MTD decision studies focus on accurately identifying optimal strategies, the issue of optimal defense timing remains underexplored. Current default approaches, namely periodic or overly frequent MTD triggers, lead to suboptimal trade-offs among system security, performance, and cost. The timing of MTD strategy activation critically impacts both defensive efficacy and operational overhead, yet existing frameworks inadequately address this temporal dimension. To bridge this gap, this paper proposes a Stackelberg-FlipIt game model that formalizes asymmetric cyber conflicts as alternating control over attack surfaces, thereby capturing the dynamic security state evolution of MTD systems. We introduce a belief factor to quantify information asymmetry during adversarial interactions, enhancing the precision of MTD trigger timing. Leveraging this game-theoretic foundation, we employ Multi-Agent Reinforcement Learning (MARL) to derive adaptive temporal strategies, optimized via a novel four-dimensional reward function that holistically balances security, performance, cost, and timing. Experimental validation using IP address mutation against scanning attacks demonstrates stable strategy convergence and accelerated defense response, significantly improving cybersecurity affordability and effectiveness.
Keywords: Cyber security; moving target defense; multi-agent reinforcement learning; security metrics; game theory
Optimal condition analysis of target localization using multi-agents with uncertain positions
15
Authors: Yi Hou, Ning Hao, Fenghua He, Chen Xie, Yu Yao. Control Theory and Technology, 2025, No. 1, pp. 131-144 (14 pages)
This paper delves into the problem of optimal placement conditions for a group of agents collaboratively localizing a target using range-only or bearing-only measurements. The challenge in this study stems from the uncertainty associated with the positions of the agents, which may experience drift or disturbances during the target localization process. Initially, we derive the Cramer-Rao lower bound (CRLB) of the target position as the primary analytical metric. Subsequently, we establish the necessary and sufficient conditions for the optimal placement of agents. Based on these conditions, we analyze the maximal allowable agent position error for an expected mean squared error (MSE), providing valuable guidance for the selection of agent positioning sensors. The analytical findings are further validated through simulation experiments.
Keywords: Cramer-Rao lower bound (CRLB); Target localization; Uncertain sensor position; multi-agent systems
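To make the CRLB metric in the abstract above more concrete, the sketch below computes the bound for the textbook range-only case in two dimensions. It is an illustration under simplifying assumptions, not the paper's derivation: agent positions are taken as exactly known (the paper's focus is precisely the case where they are not), range noise is assumed independent Gaussian with a common standard deviation `sigma`, and the function name `range_only_crlb` is hypothetical.

```python
import numpy as np

def range_only_crlb(agents, target, sigma):
    """Trace of the CRLB on a 2-D target position from range-only measurements.

    Assumes known agent positions and i.i.d. Gaussian range noise, so the
    Fisher information matrix is (1 / sigma^2) * sum_i u_i u_i^T, where u_i
    is the unit vector along the line between agent i and the target.
    """
    agents = np.asarray(agents, dtype=float)
    target = np.asarray(target, dtype=float)
    diff = target - agents                                   # shape (N, 2)
    units = diff / np.linalg.norm(diff, axis=1, keepdims=True)
    fim = units.T @ units / sigma**2                         # 2x2 Fisher information
    return np.trace(np.linalg.inv(fim))                      # lower bound on the MSE

def ring(angles_deg, radius=10.0):
    """Place agents on a circle around the origin at the given angles."""
    a = np.deg2rad(angles_deg)
    return np.column_stack((radius * np.cos(a), radius * np.sin(a)))

target = [0.0, 0.0]
spread = range_only_crlb(ring([0, 120, 240]), target, sigma=1.0)
clustered = range_only_crlb(ring([0, 10, 20]), target, sigma=1.0)
```

With three agents spaced 120 degrees apart, the Fisher information matrix is 1.5 times the identity, so the bound equals 4/3 for `sigma = 1`; clustering the same agents within a 20-degree arc gives a much larger bound, which is the kind of geometry dependence that placement conditions are meant to capture.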
Formation-containment control for nonholonomic multi-agent systems with a desired trajectory constraint
16
Authors: GU Xueqiang, LU Lina, XIANG Fengtao, ZHANG Wanpeng. Journal of Systems Engineering and Electronics, 2025, No. 1, pp. 256-268 (13 pages)
This paper addresses the time-varying formation-containment (FC) problem for nonholonomic multi-agent systems with a desired trajectory constraint, where only the leaders can acquire information about the desired trajectory. The leaders are given a fixed time-varying formation template and, while executing it, must also track the desired trajectory; the followers must converge to the convex hull spanned by the leaders. Firstly, the dynamic models of nonholonomic systems are linearized to second-order dynamics. Then, based on the desired trajectory and formation template, the FC control protocols are proposed. Sufficient conditions to achieve FC are introduced and an algorithm is proposed to resolve the control parameters by solving an algebraic Riccati equation. The system is demonstrated to achieve FC, with the average position and velocity of the leaders converging asymptotically to the desired trajectory. Finally, the theoretical achievements are verified in simulations by a multi-agent system composed of virtual human individuals.
Keywords: multi-agent systems; nonholonomic dynamics; formation-containment (FC) control; desired trajectory constraints
Hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks in the noisy intermediate-scale quantum era
17
Authors: Hao Shi, Chenghao Han, Peng Wang, Ming Zhang. Chinese Physics B, 2025, No. 12, pp. 61-74 (14 pages)
Although quantum Bayesian networks provide a promising paradigm for multi-agent decision-making, their practical application faces two challenges in the noisy intermediate-scale quantum (NISQ) era. Limited qubit resources restrict direct application to large-scale inference tasks. Additionally, no quantum methods are currently available for multi-agent collaborative decision-making. To address these, we propose a hybrid quantum–classical multi-agent decision-making framework based on hierarchical Bayesian networks, comprising two novel methods. The first is a hybrid quantum–classical inference method based on hierarchical Bayesian networks. It decomposes large-scale hierarchical Bayesian networks into modular subnetworks. The inference for each subnetwork can be performed on NISQ devices, and the intermediate results are converted into classical messages for cross-layer transmission. The second is a multi-agent decision-making method using the variational quantum eigensolver (VQE) in the influence diagram. This method models the collaborative decision-making with the influence diagram, encodes the expected utility of diverse actions into a Hamiltonian, and subsequently determines the intra-group optimal action efficiently. Experimental validation on the IonQ quantum simulator demonstrates that the hierarchical method outperforms the non-hierarchical method at the functional inference level, and the VQE method can obtain the optimal strategy exactly at the collaborative decision-making level. Our research not only extends the application of quantum computing to multi-agent decision-making but also provides a practical solution for the NISQ era.
Keywords: quantum Bayesian networks; multi-agent decision-making; hybrid quantum–classical algorithms; hierarchical Bayesian networks
Adaptive multi-agent reinforcement learning for dynamic pricing and distributed energy management in virtual power plant networks
18
Authors: Jian-Dong Yao, Wen-Bin Hao, Zhi-Gao Meng, Bo Xie, Jian-Hua Chen, Jia-Qi Wei. Journal of Electronic Science and Technology, 2025, No. 1, pp. 35-59 (25 pages)
This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
Keywords: Distributed energy management; Dynamic pricing; multi-agent reinforcement learning; Renewable energy integration; Virtual power plants
Computational Design of Interval Type-2 Fuzzy Control for Formation and Containment of Multi-Agent Systems with Collision Avoidance Capability
19
作者 Yann-Horng Lin Wen-Jer Chang +2 位作者 Yi-Chen Lee Muhammad Shamrooz Aslam Cheung-Chieh Ku 《Computer Modeling in Engineering & Sciences》 2025年第8期2231-2262,共32页
An Interval Type-2 (IT-2) fuzzy controller design approach is proposed in this research to simultaneously achieve multiple control objectives in Nonlinear Multi-Agent Systems (NMASs), including formation, containment, and collision avoidance. However, inherent nonlinearities and uncertainties present in practical control systems contribute to the challenge of achieving precise control performance. Based on the IT-2 Takagi-Sugeno Fuzzy Model (T-SFM), the fuzzy control approach can offer a more effective solution for NMASs facing uncertainties. Unlike existing control methods for NMASs, the Formation and Containment (F-and-C) control problem with collision avoidance capability under uncertainties based on the IT-2 T-SFM is discussed for the first time. Moreover, an IT-2 fuzzy tracking control approach is proposed to solve the formation task for leaders in NMASs without requiring communication. This control scheme makes the design process of the IT-2 fuzzy Formation Controller (FC) more straightforward and effective. According to the communication interaction protocol, the IT-2 Containment Controller (CC) design approach is proposed for followers to ensure convergence into the region defined by the leaders. Leveraging the IT-2 T-SFM representation, the analysis methods developed for linear Multi-Agent Systems (MASs) are successfully extended to perform containment analysis without requiring the additional assumptions imposed in existing research. Notably, the IT-2 fuzzy tracking controller can also be applied in collision avoidance situations to track the desired trajectories calculated by the avoidance algorithm under the Artificial Potential Field (APF). Benefiting from the combination of vortex and source APFs, the leaders can properly adjust the system dynamics to prevent potential collision risk. Integrating the fuzzy theory and APFs avoidance algorithm, an IT-2 fuzzy controller design approach is proposed to achieve the F-and-C purpose while ensuring collision avoidance capability. Finally, a multi-ship simulation is conducted to validate the feasibility and effectiveness of the designed IT-2 fuzzy controller.
Keywords: Interval type-2 Takagi-Sugeno fuzzy model multi-agent systems formation and containment control fuzzy collision avoidance artificial potential field
MARCS: A Mobile Crowdsensing Framework Based on Data Shapley Value Enabled Multi-Agent Deep Reinforcement Learning
20
Authors: Yiqin Wang, Yufeng Wang, Jianhua Ma, Qun Jin. Computers, Materials & Continua, 2025, Issue 3, pp. 4431-4449 (19 pages)
Opportunistic mobile crowdsensing (MCS) non-intrusively exploits human mobility trajectories, and the participants’ smart devices as sensors have become promising paradigms for various urban data acquisition tasks. However, in practice, opportunistic MCS has several challenges from both the perspectives of MCS participants and the data platform. On the one hand, participants face uncertainties in conducting MCS tasks, including their mobility and implicit interactions among participants, and participants’ economic returns given by the MCS data platform are determined by not only their own actions but also other participants’ strategic actions. On the other hand, the platform can only observe the participants’ uploaded sensing data that depends on the unknown effort/action exerted by participants to the platform, while, for optimizing its overall objective, the platform needs to properly reward certain participants for incentivizing them to provide high-quality data. To address the challenge of balancing individual incentives and platform objectives in MCS, this paper proposes MARCS, an online sensing policy based on multi-agent deep reinforcement learning (MADRL) with centralized training and decentralized execution (CTDE). Specifically, the interactions between MCS participants and the data platform are modeled as a partially observable Markov game, where participants, acting as agents, use DRL-based policies to make decisions based on local observations, such as task trajectories and platform payments. To align individual and platform goals effectively, the platform leverages Shapley value to estimate the contribution of each participant’s sensed data, using these estimates as immediate rewards to guide agent training. The experimental results on real mobility trajectory datasets indicate that the revenue of MARCS is almost 35%, 53%, and 100% higher than that of DDPG, Actor-Critic, and model predictive control (MPC), respectively, on the participant side, with similar results on the platform side, showing superior performance compared to the baselines.
Keywords: Mobile crowdsensing online data acquisition data Shapley value multi-agent deep reinforcement learning centralized training and decentralized execution (CTDE)
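Illustrative note on the Shapley-value reward mentioned in the abstract above: the abstract does not say how MARCS computes or approximates the values, so the following is only a minimal sketch of the general idea, using exact enumeration over all participant orderings. The utility function (number of distinct grid cells covered) and the names `shapley_values`, `coverage`, and `covered_cells` are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal
    contribution over every possible joining order."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal gain of adding p to the players already present
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical toy data: grid cells sensed by each participant
coverage = {"a": {1, 2}, "b": {2, 3}, "c": {4}}


def covered_cells(coalition):
    """Assumed platform utility: number of distinct cells covered."""
    cells = set()
    for p in coalition:
        cells |= coverage[p]
    return float(len(cells))


values = shapley_values(list(coverage), covered_cells)
print(values)
```

In this toy setting the values sum to the utility of the full group, so they could be handed out as per-participant rewards. Exact enumeration grows factorially with the number of participants, so a realistic system would likely need a sampling-based approximation.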