10,195 articles found
Multi-agent communication-based train control system for Indian railways: the behavioural analysis (cited by: 2)
1
Authors: Anshul Verma, K. K. Pattanaik. Journal of Modern Transportation, 2015, Issue 4, pp. 272-286 (15 pages).
Multi-agent technology has been used in many complex distributed and concurrent systems. A railway system is such a safety-critical system, and careful investigation of its functional components is very important. Study of the various functional components in a communication-based train control (CBTC) system necessitates a good structural design followed by its validation and verification through a formal modelling technique. The work presented here is the follow-up of our multi-agent-based CBTC system for Indian railways, designed using the methodology for engineering system of software agents. Behavioural analysis of the designed system involves several operating scenarios that arise during a train run, and helps in understanding the reaction of the system to such situations. This validation and verification is very important, as it allows the system designer to critically evaluate the desired function of the system and to correct design errors, if any, before actual implementation. Modelling, validation and verification of the structural design through a Coloured Petri net (CPN) are central to this paper. Analysis of simulation results validates the efficacy of the design.
Keywords: CBTC; multi-agent; Fault resolution; Modelling; Validation and verification; CPN
Targeted multi-agent communication algorithm based on state control
2
Authors: Li-yang Zhao, Tian-qing Chang, Lei Zhang, Jie Zhang, Kai-xuan Chu, De-peng Kong. Defence Technology(防务技术) (indexed in SCIE, EI, CAS, CSCD), 2024, Issue 1, pp. 544-556 (13 pages).
As an important mechanism in multi-agent interaction, communication can make agents form complex team relationships rather than constitute a simple set of multiple independent agents. However, existing communication schemes can introduce considerable timing redundancy and irrelevant messages, which seriously affects their practical application. To solve this problem, this paper proposes a targeted multi-agent communication algorithm based on state control (SCTC). SCTC uses a gating mechanism based on state control to reduce the timing redundancy of communication between agents, and determines the interaction relationship between agents and the importance weight of a communication message through a series connection of hard- and self-attention mechanisms, realizing targeted communication message processing. In addition, by minimizing the difference between the fusion message generated from the real communication message of each agent and the fusion message generated from the buffered message, the correctness of the final action choice of the agent is ensured. Our evaluation using a challenging set of StarCraft II benchmarks indicates that SCTC can significantly improve learning performance and reduce the communication overhead between agents, thus ensuring better cooperation between agents.
Keywords: multi-agent deep reinforcement learning; State control; Targeted interaction; communication mechanism
Control-Communication Co-Optimization for Wireless Cloud Robotic System via Multi-Agent Transfer Reinforcement Learning
3
Authors: Chi Xu, Junyuan Zhang, Haibin Yu. IEEE/CAA Journal of Automatica Sinica, 2026, Issue 2, pp. 311-326 (16 pages).
The wireless cloud robotic system (WCRS), which fully integrates sensing, communication, computing, and control capabilities as an intelligent agent, is a promising way to achieve intelligent manufacturing due to its easy deployment and flexible expansion. However, high-precision control of a WCRS requires deterministic wireless communication, which is always challenging in a complex and dynamic radio space. This paper employs the reconfigurable intelligent surface (RIS) to establish a novel RIS-assisted WCRS architecture, where the radio channel is controlled to achieve ultra-reliable, low-delay, and low-jitter communication for high-precision closed-loop motion control. However, control and communication are strongly coupled and should be co-optimized. Fully considering the constraints of control input threshold, control delay deadline, beam phase, antenna power, and information distortion, we establish a stability maximization problem to jointly optimize control input compensation, RIS phase shift, and beamforming. Herein, a new jitter-oriented system stability objective with respect to control error and communication jitter is defined, and the closed-form expression of the control delay deadline is derived based on Jensen's inequality and a Lyapunov-Krasovskii functional. Because the channel and robot states are time-varying and only partially observable, we model the problem as a partially observable Markov decision process (POMDP). To solve this complex problem, we propose a multi-agent transfer reinforcement learning algorithm named LSTM-PPO-MATRL, where LSTM-enhanced proximal policy optimization (PPO) is designed to approximate an optimal solution and option-guided policy transfer learning is proposed to facilitate the learning process. Using centralized training and decentralized execution, LSTM-PPO-MATRL is validated by extensive experiments on MuJoCo tasks for both low-mobility and high-mobility robotic control scenarios. The results demonstrate that LSTM-PPO-MATRL not only realizes high learning efficiency, but also supports low-delay, low-jitter communication for low-error control, where a 71.9% control accuracy improvement and a 68.7% delay jitter reduction are achieved compared to the PPO-MADRL baseline.
Keywords: multi-agent transfer reinforcement learning (MATRL); partially observable Markov decision process (POMDP); reconfigurable intelligent surface (RIS); system stability; wireless cloud robotic system (WCRS)
Multi-hop UAV relay covert communication: A multi-agent reinforcement learning approach (cited by: 1)
4
Authors: Hengzhi BAI, Haichao WANG, Rongrong HE, Jiatao DU, Guoxin LI, Yuhua XU, Yutao JIAO. Chinese Journal of Aeronautics, 2025, Issue 10, pp. 120-133 (14 pages).
Due to the characteristics of line-of-sight (LoS) communication in unmanned aerial vehicle (UAV) networks, these systems are highly susceptible to eavesdropping and surveillance. To effectively address the security concerns in UAV communication, covert communication methods have been adopted. This paper explores the joint optimization problem of trajectory and transmission power in a multi-hop UAV relay covert communication system. Considering communication covertness, power constraints, and trajectory limitations, an algorithm based on multi-agent proximal policy optimization (MAPPO), named covert-MAPPO (C-MAPPO), is proposed. The proposed method leverages the strengths of both optimization algorithms and reinforcement learning to analyze and make joint decisions on the transmission power and flight trajectory strategies of the UAVs so that they cooperate. Simulation results demonstrate that the proposed method can maximize the system throughput while satisfying covertness constraints, and that it outperforms benchmark algorithms in terms of system throughput and reward convergence speed.
Keywords: Covert communication; Unmanned aerial vehicle (UAV); Power optimization; Trajectory planning; multi-agent reinforcement learning (MARL)
Dynamic Multi-Target Jamming Channel Allocation and Power Decision-Making in Wireless Communication Networks: A Multi-Agent Deep Reinforcement Learning Approach
5
Authors: Peng Xiang, Xu Hua, Qi Zisen, Wang Dan, Zhang Yue, Rao Ning, Gu Wanyi. China Communications, 2025, Issue 5, pp. 71-91 (21 pages).
This paper studies the problem of jamming decision-making for dynamic multiple communication links in wireless communication networks (WCNs). We propose a novel jamming channel allocation and power decision-making (JCAPD) approach based on multi-agent deep reinforcement learning (MADRL). In high-dynamic and multi-target aviation communication environments, rapid channel changes make it difficult for sensors to accurately capture instantaneous channel state information. This makes centralized jamming decisions with single-agent deep reinforcement learning (DRL) approaches challenging. In response, we design a distributed multi-agent decision architecture (DMADA). We formulate multi-jammer resource allocation as a multi-agent Markov decision process (MDP) and propose a fingerprint-based double deep Q-network (FBDDQN) algorithm for solving it. In this framework, each jammer functions as an agent that interacts with the environment. Through the design of a reasonable reward and training mechanism, our approach enables jammers to achieve distributed cooperation, significantly improving the jamming success rate while accounting for jamming power cost, and reducing the transmission rate of the links. Our experimental results show that the FBDDQN algorithm is superior to the baseline methods.
Keywords: jamming resource allocation; JCAPD; MADRL; wireless communication countermeasure; wireless communication networks
Time-Varying Formation Tracking Control of Heterogeneous Multi-Agent Systems With Intermittent Communications and Directed Switching Networks
6
Authors: Yuhan Wang, Zhuping Wang, Hao Zhang, Huaicheng Yan. IEEE/CAA Journal of Automatica Sinica, 2025, Issue 1, pp. 294-296 (3 pages).
Dear Editor, This letter is concerned with the problem of time-varying formation tracking for heterogeneous multi-agent systems (MASs) under directed switching networks. For this purpose, our first step is to present some sufficient conditions for the exponential stability of a particular category of switched systems.
Keywords: switched systems; time varying formation tracking; directed switching networks; heterogeneous multi agent systems; intermittent communications; exponential stability
STABILITY ANALYSIS AND COOPERATIVE CONTROL OF DISTRIBUTED MULTI-AGENT SYSTEM WITH SAMPLED COMMUNICATION (cited by: 1)
7
Authors: 姚克明, 王小兰, 吴俊, 陆宇平, 罗德林. Transactions of Nanjing University of Aeronautics and Astronautics (indexed in EI), 2012, Issue 4, pp. 373-378 (6 pages).
The cooperative control and stability analysis problems for a multi-agent system with sampled communication are investigated. Distributed state feedback controllers are adopted for the cooperation of networked agents. A theorem in the form of linear matrix inequalities (LMI) is derived to analyze the system stability. Another theorem, in the form of an optimization problem subject to LMI constraints, is proposed to design the controller, and the corresponding algorithm is then presented. The simulation results verify the validity and effectiveness of the proposed approach.
Keywords: cooperative control; distributed control; multi-agent system; system stability; linear matrix inequality
Research on Vehicle Joint Radar Communication Resource Optimization Method Based on GNN-DRL
8
Authors: Zeyu Chen, Jian Sun, Zhengda Huan, Ziyi Zhang. Computers, Materials & Continua, 2026, Issue 2, pp. 1430-1446 (17 pages).
To address the issues of poor adaptability in resource allocation and low multi-agent cooperation efficiency in Joint Radar and Communication (JRC) systems under dynamic environments, an intelligent optimization framework integrating Deep Reinforcement Learning (DRL) and Graph Neural Network (GNN) is proposed. This framework models resource allocation as a Partially Observable Markov Game (POMG), designs a weighted reward function to balance radar and communication efficiencies, adopts the Multi-Agent Proximal Policy Optimization (MAPPO) framework, and integrates Graph Convolutional Networks (GCN) and Graph Sample and Aggregate (Graph-SAGE) to optimize information interaction. Simulations show that, compared with traditional methods and pure DRL methods, the proposed framework achieves improvements in performance metrics such as communication success rate, Average Age of Information (AoI), and policy convergence speed, effectively enabling resource management in complex environments. Moreover, the proposed GNN-DRL-based intelligent optimization framework obtains significantly better performance for resource management in multi-agent JRC systems than traditional methods and pure DRL methods.
Keywords: Graph neural network; joint radar and communication; resource allocation; multi-agent collaboration
Implementation Progress and Effects of Open Peer Review in International Journals: A Case Study of Nature Communications
9
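Authors: 刘楚菲, 盛怡瑾. 《编辑学报》 (Peking University Core Journal list), 2026, Issue 1, pp. 111-118 (8 pages).
Open peer review (OPR) is gradually becoming an important direction of practice in international academic publishing. To gain a more quantitative and intuitive understanding of how OPR is practiced, this study analyzes nearly 10 years of OPR data from Nature Communications and systematically examines the implementation status and effects of the journal's OPR model. The results show that OPR at the journal has developed from an early exploratory pilot into a routine editorial policy. Publication of reviewer reports is the most effectively implemented element, covering 74.92% of all papers; publication of reviewer names follows at 50.93%; and "double disclosure" (publishing both the reviewer reports and the reviewer names) accounts for 41.60%. In terms of academic impact, publication of reviewer reports shows an overall significant negative correlation with a paper's Category Normalized Citation Impact (CNCI), whereas publication of reviewer names shows no significant association with CNCI. In terms of research integrity, the core problems of retracted papers had in most cases already been identified and raised by reviewers during the review stage, which highlights the value of OPR for problem identification and early risk warning.
Keywords: open peer review; implementation progress; academic impact; research integrity; Nature Communications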
To the Communications Community: 2026 New Year's Message
10
Author: Zhang Ping. ZTE Communications, 2026, Issue 1, p. 1 (1 page).
At this historic juncture of deepening technological revolution and industrial transformation, China's communication sector stands on the eve of another great leap forward. Reflecting on the development of communications over the past two decades, China has forged an innovative path from catching up to keeping pace and then to leading the way. Today, at the new starting point of 6G development and facing the paradigm shift brought about by "AI+communications," China's scientific research community, with the courage to venture into uncharted territory, is advancing original theories, such as the new communication paradigm based on a unified theoretical framework of information theory, to the global forefront.
Keywords: G communications; technological revolution; AI; industrial transformation; China's industrial transformation; paradigm shift; development; communications
Energy Efficient Covert Communication in a Direct Uplink Satellite-Ground Communication Scenario
11
Authors: Fu Shu, Zeng Wen, Yin Liuguo, Zhao Lian. China Communications, 2026, Issue 1, pp. 166-174 (9 pages).
Efficient energy utilization in covert communication sustains covertness while assuring communication quality and efficiency. This paper investigates covert communication energy efficiency (EE) in direct uplink satellite-ground communications, focusing on enhancing system EE via optimized transmit beamforming and satellite orbit altitude selection. This paper first establishes an optimization problem to maximize system EE in a direct uplink satellite-ground covert communication scenario. To solve this non-convex optimization problem, it is decomposed into two subproblems, each solved using the successive convex approximation (SCA) method. Based on these methods, this paper proposes an overall iterative optimization algorithm. Simulation results demonstrate that the proposed algorithm surpasses the conventional baseline algorithms in terms of system EE. Furthermore, they elucidate the correlation between the amount of information received by the receiver and variations in the satellite's orbital altitude.
Keywords: covert communication; direct uplink satellite-ground communication; energy efficiency
GRA: Graph-based reward aggregation for cooperative multi-agent reinforcement learning
12
Authors: Jingcheng Tang, Peng Zhou, He Bai, Gangshan Jing. Journal of Automation and Intelligence, 2026, Issue 1, pp. 46-56 (11 pages).
Multi-agent reinforcement learning (MARL) has proven its effectiveness in cooperative multi-agent systems (MASs) but still faces issues with the curse of dimensionality and learning efficiency. The main difficulty is caused by the strong inter-agent coupling embedded in an MARL problem, which is yet to be fully exploited in existing algorithms. In this work, we recognize a learning graph characterizing the dependence between individual rewards and individual policies. We then propose a graph-based reward aggregation (GRA) method, which utilizes the inherent coupling relationship among agents to eliminate redundant information. Specifically, GRA passes information among cooperating agents through graph attention networks to obtain aggregated rewards that contribute to the fitting of the value function, making each agent learn a cooperation policy that can be executed in a decentralized manner. In addition, we propose a variant of GRA, named GRA-decen, which achieves decentralized training and decentralized execution (DTDE) when each agent only has access to the information of a subset of agents during learning. We conduct experiments in different environments and demonstrate the practicality and scalability of our algorithms.
Keywords: Networked system; multi-agent reinforcement learning; Graph-based RL
Fixed-Time Zeroing Neural Dynamics for Adaptive Coordination of Multi-Agent Systems
13
Authors: Cheng Hua, Xinwei Cao, Jianfeng Li, Shuai Li. CAAI Transactions on Intelligence Technology, 2026, Issue 1, pp. 267-278 (12 pages).
This paper presents an adaptive multi-agent coordination (AMAC) strategy suitable for complex scenarios, which only requires information exchange between neighbouring robots. Unlike traditional multi-agent coordination methods that are solved by neural dynamics, the proposed strategy displays greater flexibility, adaptability and scalability. Furthermore, the proposed AMAC strategy is reconstructed as a time-varying complex-valued matrix equation. By introducing a dynamic error function, a fixed-time convergent zeroing neural network (FTCZNN) model is designed for the online solution of the AMAC strategy, with an upper bound on its convergence time derived theoretically. Finally, the effectiveness and applicability of the coordination control method are demonstrated by numerical simulations and physical experiments. Numerical results indicate that this method can reduce the formation error to the order of 10^(-6) within 1.8 s.
Keywords: fixed-time convergence; multi-agent coordination; robotics; zeroing neural dynamics
Output feedback prescribed performance state synchronization for leader-following high-order uncertain nonlinear multi-agent systems
14
Authors: Ilias Katsoukis, George A. Rovithakis. Journal of Automation and Intelligence, 2026, Issue 1, pp. 35-45 (11 pages).
This paper addresses the synchronization of follower agents' state vectors with that of a leader in high-order nonlinear multi-agent systems. The proposed low-complexity control scheme employs high-gain observers to estimate higher-order synchronization errors, enabling the controller to rely solely on relative output measurements. This approach significantly reduces the dependence on full-state information, which is often infeasible or costly to obtain in practical engineering applications. An output feedback control strategy is developed to overcome these limitations while ensuring robust and effective synchronization. Simulation results are provided to demonstrate the effectiveness of the proposed approach and validate the theoretical findings.
Keywords: Synchronization problem; Leader-following; High-order nonlinear systems; multi-agent systems; High-gain observer
Enhancing Disaster Response with IoFT: An Adaptive Communication Model for UAV-Based Surveillance
15
Author: A. F. M. Suaib Akhter. Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 893-921 (29 pages).
The modern world remains vulnerable to natural disasters, including floods, earthquakes, wildfires, and others. These events remain unpredictable and inevitable, and recovering quickly and effectively requires significant effort and expense. Monitoring is becoming more efficient thanks to technologies such as Unmanned Aerial Vehicles (UAVs), which can access hard-to-reach areas and provide real-time data. However, in disaster-affected areas, these monitoring systems may encounter many obstacles when communicating with servers or transmitting monitored data. This paper proposes an adaptive communication model to overcome the challenges faced in disaster-affected areas. A base station is responsible for collecting data (such as images and videos) captured by UAVs performing surveillance within its communication range. This station is typically a tower providing fixed cellular network service. However, in the absence of such a tower, a selected UAV may serve as the station, depending on the situation. If surveillance needs to be performed outside the coverage area, a UAV can continue to communicate via nearby UAVs through cooperative communication. UAVs with internet support, known as the Internet of Flying Things (IoFT), will also be utilized to enhance communication capacity and efficiency. The proposed communication model is validated through experiments, showing superior data transmission performance and higher throughput. Analysis indicates that it outperforms traditional systems, even in rural areas, with or without internet access.
Keywords: UAV communication; IoFT; natural disaster; IoT
Distributed unsupervised meta-learning algorithm over multi-agent systems
16
Authors: Zhenzhen Wang, Bing He, Zixin Jiang, Xianyang Zhang, Haidi Dong, Di Ye. Digital Communications and Networks, 2026, Issue 1, pp. 134-142 (9 pages).
Multi-Agent Systems (MAS), which consist of multiple interacting agents, are crucial in Cyber-Physical Systems (CPS) because they improve system adaptability, efficiency, and robustness through parallel processing and collaboration. However, most existing unsupervised meta-learning methods are centralized and not suitable for multi-agent systems, where data are stored in a distributed manner and are not accessible to all agents. Meta-GMVAE, based on the Variational Autoencoder (VAE) and set-level variational inference, is a sophisticated unsupervised meta-learning model that improves generative performance by efficiently learning data representations across various tasks, increasing adaptability and reducing sample requirements. Inspired by these advancements, we propose a novel Distributed Unsupervised Meta-Learning (DUML) framework based on Meta-GMVAE and a fusion strategy. Furthermore, we present a DUML algorithm based on the Gaussian Mixture Model (DUMLGMM), where the parameters of the Gaussian mixture are solved by an Expectation-Maximization algorithm. Simulations on the Omniglot and Mini-ImageNet datasets show that DUMLGMM can match the performance of the corresponding centralized algorithm and outperform the non-cooperative algorithm.
Keywords: Unsupervised meta-learning; multi-agent systems; Variational autoencoder; Gaussian mixture model
Leader-following positive consensus of heterogeneous switched multi-agent systems with average dwell time switching
17
Authors: Kaiming Li, Wei Xing, Haoyue Yang, Junfeng Zhang. Control Theory and Technology, 2026, Issue 1, pp. 66-81 (16 pages).
This paper focuses on the leader-following positive consensus problems of heterogeneous switched multi-agent systems. First, a state-feedback controller with dynamic compensation is introduced to achieve positive consensus under average dwell time switching. Then, sufficient conditions are derived to guarantee positive consensus. The gain matrices of the control protocol are described using a matrix decomposition approach, and the corresponding computational complexity is reduced by resorting to linear programming and co-positive Lyapunov functions. Finally, two numerical examples are provided to illustrate the results obtained.
Keywords: Heterogeneous switched multi-agent systems; Positive consensus; Linear programming
Toward Collaborative and Adaptive Learning: A Survey of Multi-agent Reinforcement Learning in Education
18
Authors: Sirine Bouguettaya, Ouarda Zedadra, Francesco Pupo, Giancarlo Fortino. Artificial Intelligence Science and Engineering, 2026, Issue 1, pp. 1-19 (19 pages).
In recent years, researchers have leveraged single-agent reinforcement learning to boost educational outcomes and deliver personalized interventions; yet this paradigm provides no capacity for inter-agent interaction. Multi-agent reinforcement learning (MARL) overcomes this limitation by allowing several agents to learn simultaneously within a shared environment, each choosing actions that maximize its own or the group's rewards. By explicitly modeling and exploiting agent-to-agent dynamics, MARL can align those interactions with pedagogical goals such as peer tutoring, collaborative problem-solving, or gamified competition, thus opening richer avenues for adaptive and socially informed learning experiences. This survey investigates the impact of MARL on educational outcomes by examining evidence of its effectiveness in enhancing learner performance, engagement, and equity, and in reducing teacher workload, compared to single-agent or traditional approaches. It explores the educational domains and pedagogical problems addressed by MARL, identifies the algorithmic families used, and analyzes their influence on learning. The review also assesses experimental settings and evaluation metrics to determine ecological validity, and outlines current challenges and future research directions in applying MARL to education.
Keywords: reinforcement learning; multi-agent reinforcement learning; Agentic AI; education; generative AI
Research on UAV-MEC Cooperative Scheduling Algorithms Based on Multi-Agent Deep Reinforcement Learning
19
Authors: Yonghua Huo, Ying Liu, Anni Jiang, Yang Yang. Computers, Materials & Continua, 2026, Issue 3, pp. 1823-1850 (28 pages).
With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, whereas classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results reveal that the proposed method has better convergence and stability. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
Keywords: UAV-MEC networks; multi-agent deep reinforcement learning; MATD3; task offloading
Hierarchical Demand Response Considering Dynamic Competing Interaction Based on Multi-agent Deep Deterministic Policy Gradient
20
Authors: Wenhao Wang, Jiehui Zheng, Zhaoxi Liu, Jiakun Fang, Zhigang Li, Q. H. Wu. CSEE Journal of Power and Energy Systems, 2026, Issue 1, pp. 162-174 (13 pages).
To maximize the profits of power grid operators (GOs), load aggregators (LAs) and electricity customers (ECs), this paper proposes a hierarchical demand response (HDR) framework that considers competing interaction based on multi-agent deep deterministic policy gradient (MaDDPG). The ECs are divided into conventional ECs and electric vehicles (EVs), which are managed by an ECs agent (ECA) and an EV agent (EVA) to exploit the flexibility of the HDR framework. Thus, the HDR is a tri-layer model determined by five types of agents engaging in competing interaction to maximize their own profits. To address the limitations of mathematical expression and participation scale in the Stackelberg game within the HDR model, a dynamic interaction mechanism is adopted. Moreover, to tackle the HDR involving various entities, the MaDDPG develops multiple agents to simulate the dynamic competing interactions between the subjects as well as to solve the problem of continuous action control. Furthermore, MaDDPG adopts soft target updates and prioritized experience replay to ensure stable and effective training, and makes the exploration strategy comprehensive by using exploration noise. Simulation studies are conducted to verify the performance of the MaDDPG with the dynamic interaction mechanism in dealing with multi-layer multi-agent continuous action control, compared to the double deep Q network (DDQN), deep Q network (DQN) and dueling DQN. Additionally, comparisons of the proposed HDR with price-based DR (PBDR) and incentive-based DR (IBDR) are analyzed to investigate the flexibility of the HDR.
Keywords: Continuous action control; deep reinforcement learning; demand response; dynamic interaction mechanism; multi-agent