In this paper, the problems of target tracking and obstacle avoidance for multi-agent networks with input constraints are investigated. When there is a moving obstacle, the control objectives are to make the agents track a moving target and to avoid collisions among agents. First, without considering the input constraints, a novel distributed controller is obtained based on the potential function. Second, at each sampling time, the control algorithm is optimized. Furthermore, to solve the problem that agents cannot effectively avoid obstacles in a dynamic environment where the obstacles are moving, a new velocity repulsive potential is designed. One advantage of the designed control algorithm is that each agent requires only local knowledge of its neighboring agents. Finally, simulation results are provided to verify the effectiveness of the proposed approach.
This paper investigates leader-following formation control and presents the design of control protocols and distributed observers under which the agents achieve and maintain the desired formation from any initial state, while their velocity converges to that of a virtual leader whose velocity cannot be measured by the agents in real time. Two cases are considered for multi-agent networks: switching topologies without communication delay, and a fixed topology with time-varying communication delay. Using Lyapunov stability theory, stability is analysed for multi-agent systems with switching topologies. Then, taking the time-varying communication delay into account, a sufficient condition is proposed for multi-agent systems with fixed topology. Finally, two numerical examples are given to illustrate the effectiveness of the proposed leader-following formation control protocols.
This paper investigates the consensus disturbance rejection problem among multiple high-order agents with directed graphs. Based on disturbance observers, distributed consensus disturbance rejection protocols are constructed in leaderless and leader-follower consensus setups. Unlike previous related work, the consensus protocols in this paper are developed in a fully distributed fashion, relying only on the state information of each agent and its neighbors. Sufficient conditions are provided to guarantee that the asymptotic stability of high-order multi-agent systems can be reached with matched disturbances.
In this paper, the pinning consensus of multi-agent networks with arbitrary topology is investigated. Based on the properties of the M-matrix, some criteria for pinning consensus are established for the continuous-time multi-agent network. The results show that the pinning consensus of the dynamical system depends on the smallest real part of the eigenvalues of the matrix composed of the Laplacian matrix of the multi-agent network and the pinning control gains. The discrete-time system is also studied and the corresponding criterion is obtained. In particular, the fundamental problem of pinning consensus, that is, what kind of node should be pinned, is investigated and positive answers to this question are presented. Finally, the correctness of the theoretical findings is demonstrated by numerical simulation examples.
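The role of the Laplacian and the pinning gains described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's criteria: it assumes single-integrator agents, an undirected four-node path graph, a single pinned node, and forward-Euler integration. The names `simulate_pinning`, `adj`, `gains`, and `target` are hypothetical and chosen only for illustration.

```python
def simulate_pinning(adj, gains, x0, target, dt=0.01, steps=20000):
    """Forward-Euler simulation of x_i' = sum_j a_ij (x_j - x_i) - d_i (x_i - target).

    adj    : adjacency matrix (list of lists), a_ij >= 0
    gains  : pinning gain d_i per node (0 means the node is not pinned)
    x0     : initial scalar states
    target : reference value that pinned nodes are driven towards
    """
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        dx = []
        for i in range(n):
            # Diffusive coupling with neighbours (the Laplacian term).
            coupling = sum(adj[i][j] * (x[j] - x[i]) for j in range(n))
            # Pinning feedback, active only where gains[i] > 0.
            pin = -gains[i] * (x[i] - target)
            dx.append(coupling + pin)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


# Assumed example: path graph 0-1-2-3, only node 0 is pinned.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
gains = [2.0, 0.0, 0.0, 0.0]
final = simulate_pinning(adj, gains, [5.0, -3.0, 1.0, 4.0], target=1.5)
print(final)
```

In this assumed setup the graph is connected and one node is pinned, so the matrix formed by the Laplacian plus the diagonal gain matrix is positive definite and every state approaches `target`. With all gains set to zero, the states instead settle at the average of the initial values rather than at the reference.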
This paper considers the formation tracking problem under a rigidity framework, where the target formation is specified as a minimally and infinitesimally rigid formation and the desired velocity of the group is available to only a subset of the agents. Two cases are considered: the desired velocity is constant, and the desired velocity is time-varying. In the first case, a distributed linear estimator is constructed for each agent to estimate the desired velocity. The velocity estimate and a formation acquisition term are employed to design the control inputs for the agents, where the rigidity matrix plays a central role. In the second case, a distributed non-smooth estimator is constructed to estimate the time-varying velocity, and it is shown to converge in finite time. Theoretical analysis shows that the formation tracking problem can be solved under the proposed control algorithms and estimators. Simulation results are also provided to show the validity of the derived results.
Multi-agent systems often require good interoperability in the process of completing their assigned tasks. This paper first models the static structure and dynamic behavior of multi-agent systems based on a layered weighted scale-free community network and the susceptible-infected-recovered (SIR) model. To address the difficulty of describing changes in the structure and collaboration mode of the system under external factors, a two-dimensional Monte Carlo method and an improved dynamic Bayesian network are used to simulate the impact of external environmental factors on multi-agent systems. A collaborative information flow path optimization algorithm for agents under environmental factors is designed based on the Dijkstra algorithm. A method for evaluating system interoperability is designed based on simulation experiments, providing a reference for the construction planning and the optimization of organizational application of the system. Finally, the feasibility of the method is verified through case studies.
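Since this abstract builds its path optimization on the Dijkstra algorithm, a plain version of that algorithm may help as a reference point. The sketch below is the standard textbook Dijkstra shortest-path search with a binary heap, not the paper's environment-aware variant. The graph, its edge weights, and the names `dijkstra` and `graph` are assumptions made up for illustration.

```python
import heapq


def dijkstra(graph, source, goal):
    """Return (cost, path) of a cheapest path in a graph with non-negative weights.

    graph maps each node to a dict {neighbour: edge_weight}.
    Returns (inf, []) when goal cannot be reached from source.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == goal:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return float("inf"), []
    # Walk the predecessor links back from goal to source.
    path = [goal]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path


# Hypothetical agent network; weights stand for an assumed communication cost.
graph = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
cost, path = dijkstra(graph, "A", "D")
print(cost, path)  # 4.0 ['A', 'B', 'C', 'D']
```

In a setting like the one the abstract describes, the edge weights would presumably be updated from the simulated environmental factors before each search; that coupling is not specified in the abstract and is left out here.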
In the islanded operation of distribution networks, due to the mismatch of line impedance at the inverter output, conventional droop control leads to inaccurate power sharing according to capacity, resulting in voltage and frequency fluctuations under minor external disturbances. To address this issue, this paper introduces an enhanced scheme for power sharing and voltage-frequency control. First, to solve the power distribution problem, we propose an adaptive virtual impedance control based on multi-agent consensus, which allows for precise active and reactive power allocation without requiring feeder impedance knowledge. Moreover, a novel consensus-based voltage and frequency control is proposed to correct the voltage deviation inherent in droop control and virtual impedance methods. This strategy maintains voltage and frequency stability even during communication disruptions and enhances system robustness. Additionally, a small-signal model is established for system stability analysis, and the control parameters are optimized. Simulation results validate the effectiveness of the proposed control scheme.
This paper presents a novel approach to dynamic pricing and distributed energy management in virtual power plant (VPP) networks using multi-agent reinforcement learning (MARL). As the energy landscape evolves towards greater decentralization and renewable integration, traditional optimization methods struggle to address the inherent complexities and uncertainties. Our proposed MARL framework enables adaptive, decentralized decision-making for both the distribution system operator and individual VPPs, optimizing economic efficiency while maintaining grid stability. We formulate the problem as a Markov decision process and develop a custom MARL algorithm that leverages actor-critic architectures and experience replay. Extensive simulations across diverse scenarios demonstrate that our approach consistently outperforms baseline methods, including Stackelberg game models and model predictive control, achieving an 18.73% reduction in costs and a 22.46% increase in VPP profits. The MARL framework shows particular strength in scenarios with high renewable energy penetration, where it improves system performance by 11.95% compared with traditional methods. Furthermore, our approach demonstrates superior adaptability to unexpected events and mis-predictions, highlighting its potential for real-world implementation.
This paper studies the problem of jamming decision-making for dynamic multiple communication links in wireless communication networks (WCNs). We propose a novel jamming channel allocation and power decision-making (JCAPD) approach based on multi-agent deep reinforcement learning (MADRL). In high-dynamic and multi-target aviation communication environments, the rapid changes in channels make it difficult for sensors to accurately capture instantaneous channel state information. This makes it challenging to reach centralized jamming decisions with single-agent deep reinforcement learning (DRL) approaches. In response, we design a distributed multi-agent decision architecture (DMADA). We formulate multi-jammer resource allocation as a multi-agent Markov decision process (MDP) and propose a fingerprint-based double deep Q-Network (FBDDQN) algorithm for solving it. Each jammer functions as an agent that interacts with the environment in this framework. Through the design of a reasonable reward and training mechanism, our approach enables jammers to achieve distributed cooperation, significantly improving the jamming success rate while considering jamming power cost, and reducing the transmission rate of links. Our experimental results show that the FBDDQN algorithm is superior to the baseline methods.
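For readers unfamiliar with the double deep Q-network idea that FBDDQN builds on, the core step is how the learning target is formed: the online network selects the next action and the target network evaluates it. The sketch below shows only that generic double-DQN target for a single transition, using plain lists of Q-values in place of neural networks. It does not include the fingerprint mechanism or anything else specific to the paper, and the function and parameter names are hypothetical.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Generic double-DQN target for one transition.

    next_q_online : Q-values of the next state from the online network
    next_q_target : Q-values of the next state from the target network
    The online values choose the action; the target values score it.
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Assumed toy values: the online network prefers action 1,
# so the target network's estimate for action 1 (0.4) is used.
y = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[1.0, 0.4, 2.0],
    gamma=0.5,
)
print(y)
```

Separating action selection from action evaluation is what distinguishes this target from the ordinary DQN target, which would take the maximum of `next_q_target` directly (2.0 in the toy values above).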
Unmanned Aerial Vehicles (UAVs) have demonstrated significant potential as Aerial Base Stations (A-BSs) for providing data services to Ground Users (GUs), owing to their flexibility, cost-effectiveness, and high likelihood of establishing line-of-sight links. In this article, we formulate the joint power and trajectory optimization problem for a multi-UAV assisted wireless network with no-fly zone constraints, aiming at maximizing the Accumulated Service Data (ASD) of UAVs and minimizing the Average End Age of Information (AEAoI) of GUs. Specifically, this paper proposes the Multi-Agent worst-case Soft Actor Critic (MA-wcSAC) algorithm with a distributional safety-critic. The simulation results demonstrate that, compared to the Multi-Agent Soft Actor Critic (MA-SAC) algorithm, the proposed algorithm exhibits comparable data service performance while reducing security risks by at least 30% at different risk levels.
Dear Editor, This letter is concerned with the problem of time-varying formation tracking for heterogeneous multi-agent systems (MASs) under directed switching networks. For this purpose, our first step is to present some sufficient conditions for the exponential stability of a particular category of switched systems.
This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, by combining a memory-based adaptive event-triggered mechanism with neural networks, the paper aims to approximate the nonlinear terms in the networked system and efficiently conserve system resources. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
The Cybertwin-enabled 6th Generation (6G) network is envisioned to support artificial intelligence-native management to meet the changing demands of 6G applications. Multi-Agent Deep Reinforcement Learning (MADRL) technologies driven by Cybertwins have been proposed for adaptive task offloading strategies. However, related works do not consider the random transmission delay between Cybertwin-driven agents and the underlying networks, which destroys the standard Markov property and increases the decision reaction time, reducing the performance of the task offloading strategy. To address this problem, we propose a pipelining task offloading method to lower the decision reaction time and model it as a delay-aware Markov Decision Process (MDP). Then, we design a delay-aware MADRL algorithm to minimize the weighted sum of task execution latency and energy consumption. Firstly, the state space is augmented using the last-received state and historical actions to rebuild the Markov property. Secondly, Gate Transformer-XL is introduced to capture the importance of historical actions and to maintain a consistent input dimension, which otherwise changes dynamically due to random transmission delays. Thirdly, a sampling method and a new loss function, which combines the difference between the current and target state values with the difference between the real and augmented state-action values, are designed to obtain state transition trajectories close to the real ones. Numerical results demonstrate that the proposed methods are effective in reducing reaction time and improving task offloading performance in random-delay Cybertwin-enabled 6G networks.
Aiming at the problem of mobile data traffic surge in 5G networks, this paper proposes an effective solution combining massive multiple-input multiple-output techniques with Ultra-Dense Network (UDN) and focuses on solving the resulting challenge of increased energy consumption. A base station control algorithm based on Multi-Agent Proximity Policy Optimization (MAPPO) is designed. In the constructed 5G UDN model, each base station is considered as an agent, and the MAPPO algorithm enables inter-base station collaboration and interference management to optimize the network performance. To reduce the extra power consumption due to frequent sleep mode switching of base stations, a sleep mode switching decision algorithm is proposed. The algorithm reduces unnecessary power consumption by evaluating the network state similarity and intelligently adjusting the agent's action strategy. Simulation results show that the proposed algorithm reduces the power consumption by 24.61% compared to the no-sleep strategy and further reduces the power consumption by 5.36% compared to the traditional MAPPO algorithm under the premise of guaranteeing the quality of service of users.
This paper examines the difficulties of managing distributed power systems, notably due to the increasing use of renewable energy sources, and focuses on voltage control challenges exacerbated by their variable nature in modern power grids. To tackle the unique challenges of voltage control in distributed renewable energy networks, researchers are increasingly turning towards multi-agent reinforcement learning (MARL). However, MARL raises safety concerns because agent actions are unpredictable during the exploration phase, which can lead to unsafe control measures. To mitigate these safety concerns in MARL-based voltage control, our study introduces a novel approach: Safety-Constrained Multi-Agent Reinforcement Learning (SC-MARL). This approach incorporates a specialized safety constraint module designed for voltage control within the MARL framework. This module ensures that the MARL agents carry out voltage control actions safely. The experiments demonstrate that, in the 33-bus, 141-bus, and 322-bus power systems, employing SC-MARL for voltage control reduced the Voltage Out of Control Rate (%V.out) from 0.43, 0.24, and 2.95 to 0, 0.01, and 0.03, respectively. Additionally, the Reactive Power Loss (Q loss) decreased from 0.095, 0.547, and 0.017 to 0.062, 0.452, and 0.016 in the corresponding systems.
Medical procedures are inherently invasive and carry the risk of inducing pain to the mind and body. Recently, efforts have been made to alleviate the discomfort associated with invasive medical procedures through the use of virtual reality (VR) technology. VR has been demonstrated to be an effective treatment for pain associated with medical procedures, as well as for chronic pain conditions for which no effective treatment has been established. The precise mechanism by which the diversion from reality facilitated by VR contributes to the diminution of pain and anxiety has yet to be elucidated. However, the provision of positive images through VR-based visual stimulation may enhance the functionality of brain networks. The salience network is diminished, while the default mode network is enhanced. Additionally, the medial prefrontal cortex may establish a stronger connection with the default mode network, which could result in a reduction of pain and anxiety. Further research into the potential of VR technology to alleviate pain could lead to a reduction in the number of individuals who overdose on painkillers and contribute to positive change in the medical field.
Complex network models are frequently employed for simulating and studying diverse real-world complex systems. Among these models, scale-free networks typically exhibit greater fragility to malicious attacks. Consequently, enhancing the robustness of scale-free networks has become a pressing issue. To address this problem, this paper proposes a Multi-Granularity Integration Algorithm (MGIA), which aims to improve the robustness of scale-free networks while keeping the initial degree of each node unchanged, ensuring network connectivity and avoiding the generation of multiple edges. The algorithm generates a multi-granularity structure from the initial network to be optimized, then uses different optimization strategies to optimize the networks at various granular layers in this structure, and finally realizes the information exchange between different granular layers, thereby further enhancing the optimization effect. We propose new network refresh, crossover, and mutation operators to ensure that the optimized network satisfies the given constraints. Meanwhile, we propose new network similarity and network dissimilarity evaluation metrics to improve the effectiveness of the optimization operators in the algorithm. In the experiments, the MGIA enhances the robustness of the scale-free network by 67.6%. This improvement is approximately 17.2% higher than the optimization effects achieved by eight currently existing complex network robustness optimization algorithms.
Satellite edge computing has garnered significant attention from researchers; however, processing a large volume of tasks within multi-node satellite networks still poses considerable challenges. The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers, making it necessary to implement effective task offloading scheduling to enhance user experience. In this paper, we propose a priority-based task scheduling strategy based on a Software-Defined Network (SDN) framework for satellite-terrestrial integrated networks, which clarifies the execution order of tasks based on their priority. Subsequently, we apply a Dueling-Double Deep Q-Network (DDQN) algorithm enhanced with prioritized experience replay to derive a computation offloading strategy, improving the experience replay mechanism within the Dueling-DDQN framework. Next, we utilize the Deep Deterministic Policy Gradient (DDPG) algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks. Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches, effectively reducing task processing latency and thus improving user experience and system efficiency.
Funding (target tracking and obstacle avoidance paper): Supported by the National Basic Research Program of China (973 Program) (No. 2010CB731800); the Key Project of National Science Foundation of China (No. 60934003); the National Nature Science Foundation of China (No. 61074065); the Key Project for Natural Science Research of Hebei Education Department, PRC (No. ZD200908); and the Key Project for Shanghai Committee of Science and Technology (No. 08511501600).
Funding (leader-following formation control paper): Supported by the National Natural Science Foundation for Distinguished Young Scholars of China (Grant No. 60525303); the National Natural Science Foundation of China (Grant No. 60704009); the Key Project for Natural Science Research of the Hebei Educational Department (Grant No. ZD200908); and the Doctorial Fund of Yanshan University (Grant No. B203).
Funding (consensus disturbance rejection paper): Supported by the National Natural Science Foundation of China (Nos. U1713223 and 61876187); the Beijing Nova Program (No. 2018047); and the Joint Fund of Ministry of Education of China for Equipment Preresearch.
Funding (pinning consensus paper): Supported by the National Natural Science Foundation of China (Grant Nos. 60973114 and 61170249); the Natural Science Foundation of Chongqing Science and Technology Commission, China (Grant Nos. 2009BA2024, cstc2011jjA40045, and cstc2013jcyjA0906); and the State Key Laboratory of Power Transmission Equipment & System Security and New Technology, Chongqing University, China (Grant No. 2007DA10512711206).
Funding (rigidity-based formation tracking paper): Supported by the National Natural Science Foundation of China (Grant No. 61473240).
Funding (multi-agent system interoperability paper): Supported by the Key R&D Projects in Jiangsu Province (BE2021729) and the Key Primary Research Project of Primary Strengthening Program (KYZYJKKCJC23001).
Funding (islanded distribution network power sharing paper): Supported by the National Natural Science Foundation of China (52007009); the Natural Science Foundation of Excellent Youth Project of Hunan Province of China (2023JJ20039); and the Science and Technology Projects of State Grid Hunan Provincial Electric Power Co., Ltd. (5216A522001K, SGHNDK00PWJS2310173).
Funding (VPP dynamic pricing and energy management paper): Supported by the Science and Technology Project of State Grid Sichuan Electric Power Company Chengdu Power Supply Company under Grant No. 521904240005.
Funding: Supported in part by the National Natural Science Foundation of China (No. 61906156).
Abstract: This paper studies the problem of jamming decision-making for dynamic multiple communication links in wireless communication networks (WCNs). We propose a novel jamming channel allocation and power decision-making (JCAPD) approach based on multi-agent deep reinforcement learning (MADRL). In high-dynamic and multi-target aviation communication environments, the rapid changes in channels make it difficult for sensors to accurately capture instantaneous channel state information. This makes it challenging to reach centralized jamming decisions with single-agent deep reinforcement learning (DRL) approaches. In response, we design a distributed multi-agent decision architecture (DMADA). We formulate multi-jammer resource allocation as a multi-agent Markov decision process (MDP) and propose a fingerprint-based double deep Q-Network (FBDDQN) algorithm for solving it. Each jammer functions as an agent that interacts with the environment in this framework. Through the design of a reasonable reward and training mechanism, our approach enables jammers to achieve distributed cooperation, significantly improving the jamming success rate while considering jamming power cost, and reducing the transmission rate of links. Our experimental results show that the FBDDQN algorithm is superior to the baseline methods.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62371369 and 62376204) and the National Key R&D Program of China (No. 2022YFC3301300).
Abstract: Unmanned Aerial Vehicles (UAVs) have demonstrated significant potential as Aerial Base Stations (A-BSs) for providing data services to Ground Users (GUs), owing to their flexibility, cost-effectiveness, and high likelihood of establishing line-of-sight links. In this article, we formulate the joint power and trajectory optimization problem for a multi-UAV assisted wireless network subject to no-fly zone constraints, aiming at maximizing the Accumulated Service Data (ASD) of UAVs and minimizing the Average End Age of Information (AEAoI) of GUs. Specifically, this paper proposes the Multi-Agent worst-case Soft Actor Critic (MA-wcSAC) algorithm with a distributional safety-critic. The simulation results demonstrate that, compared to the Multi-Agent Soft Actor Critic (MA-SAC) algorithm, the proposed algorithm exhibits comparable data service performance while reducing security risks by at least 30% at different risk levels.
Funding: Supported in part by the National Natural Science Foundation of China (62273255, 62350003, 62088101), the Shanghai Science and Technology Cooperation Project (22510712000, 21550760900), the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0100), and the Fundamental Research Funds for the Central Universities.
Abstract: Dear Editor, This letter is concerned with the problem of time-varying formation tracking for heterogeneous multi-agent systems (MASs) under directed switching networks. For this purpose, our first step is to present some sufficient conditions for the exponential stability of a particular category of switched systems.
Funding: The National Natural Science Foundation of China (W2431048); the Science and Technology Research Program of Chongqing Municipal Education Commission, China (KJZDK202300807); the Chongqing Natural Science Foundation, China (CSTB2024NSCQQCXMX0052).
Abstract: This paper addresses the consensus problem of nonlinear multi-agent systems subject to external disturbances and uncertainties under denial-of-service (DoS) attacks. Firstly, an observer-based state feedback control method is employed to achieve secure control by estimating the system's state in real time. Secondly, a memory-based adaptive event-triggered mechanism is combined with neural networks to approximate the nonlinear terms in the networked system and to conserve system resources efficiently. Finally, based on a two-degree-of-freedom model of a vehicle affected by crosswinds, this paper constructs a multi-unmanned ground vehicle (Multi-UGV) system to validate the effectiveness of the proposed method. Simulation results show that the proposed control strategy can effectively handle external disturbances such as crosswinds in practical applications, ensuring the stability and reliable operation of the Multi-UGV system.
Funding: Funded by the National Key Research and Development Program of China under Grant 2019YFB1803301 and the Beijing Natural Science Foundation (L202002).
Abstract: The Cybertwin-enabled 6th Generation (6G) network is envisioned to support artificial intelligence-native management to meet the changing demands of 6G applications. Multi-Agent Deep Reinforcement Learning (MADRL) technologies driven by Cybertwins have been proposed for adaptive task offloading strategies. However, related works do not consider the random transmission delay between Cybertwin-driven agents and the underlying networks, which destroys the standard Markov property and increases the decision reaction time, degrading the performance of the task offloading strategy. To address this problem, we propose a pipelining task offloading method to lower the decision reaction time and model it as a delay-aware Markov Decision Process (MDP). Then, we design a delay-aware MADRL algorithm to minimize the weighted sum of task execution latency and energy consumption. Firstly, the state space is augmented with the most recently received state and historical actions to rebuild the Markov property. Secondly, Gate Transformer-XL is introduced to capture the importance of historical actions and to maintain a consistent input dimension, which would otherwise change dynamically due to random transmission delays. Thirdly, a sampling method and a new loss function are designed to obtain state transition trajectories close to the real ones; the loss function includes the difference between the current and target state values and the difference between the real and augmented state-action values. Numerical results demonstrate that the proposed methods are effective in reducing reaction time and improving the task offloading performance in random-delay Cybertwin-enabled 6G networks.
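The state-augmentation step described in the abstract above can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the class name `DelayAugmenter`, its methods, and the bounded-delay assumption are hypothetical, and plain zero-padding is swapped in for the Gate Transformer-XL that the paper uses to handle the varying input dimension. The agent keeps the most recently received state together with the actions it has issued since that state was observed, and concatenates them into a fixed-length vector.

```python
from collections import deque


class DelayAugmenter:
    """Builds an augmented state: last received state + actions issued since.

    Assumes delays are bounded by `max_delay` steps (an assumption of this
    sketch) and pads with zeros so the output length never changes.
    """

    def __init__(self, initial_state, max_delay, action_dim):
        self.max_delay = max_delay
        self.action_dim = action_dim
        self.last_state = list(initial_state)
        # Oldest actions fall off automatically once max_delay is exceeded.
        self.pending = deque(maxlen=max_delay)

    def record_action(self, action):
        """Store an action the agent has just issued."""
        self.pending.append(list(action))

    def receive_state(self, state, delay):
        """A state observed `delay` steps ago arrives; keep only later actions."""
        while len(self.pending) > delay:
            self.pending.popleft()
        self.last_state = list(state)

    def augmented(self):
        """Return the fixed-length augmented state vector."""
        flat = [a for action in self.pending for a in action]
        pad = [0.0] * (self.max_delay * self.action_dim - len(flat))
        return self.last_state + flat + pad


aug = DelayAugmenter(initial_state=[0.0, 0.0], max_delay=3, action_dim=1)
aug.record_action([1.0])
aug.record_action([2.0])
aug.receive_state([0.5, 0.7], delay=1)  # only the latest action is still unobserved
vec = aug.augmented()
```

Because the policy sees which of its own actions have not yet been reflected in the observed state, the augmented process regains the Markov property under the bounded-delay assumption.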
Funding: Supported by the National Natural Science Foundation of China (62271096, U20A20157), the Natural Science Foundation of Chongqing, China (CSTB2023NSCQ-LZX0134), the University Innovation Research Group of Chongqing (CXQT20017), the Youth Innovation Group Support Program of ICE Discipline of CQUPT (SCIE-QN-2022-04), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202300632), and the Chongqing Postdoctoral Special Funding Project (2022CQBSHTB2057).
Abstract: To cope with the surge of mobile data traffic in 5G networks, this paper proposes a solution that combines massive multiple-input multiple-output techniques with the Ultra-Dense Network (UDN) and focuses on the resulting challenge of increased energy consumption. A base station control algorithm based on Multi-Agent Proximal Policy Optimization (MAPPO) is designed. In the constructed 5G UDN model, each base station is considered as an agent, and the MAPPO algorithm enables inter-base station collaboration and interference management to optimize the network performance. To reduce the extra power consumption due to frequent sleep mode switching of base stations, a sleep mode switching decision algorithm is proposed. The algorithm reduces unnecessary power consumption by evaluating the network state similarity and intelligently adjusting the agent's action strategy. Simulation results show that, while guaranteeing the quality of service of users, the proposed algorithm reduces power consumption by 24.61% compared to the no-sleep strategy and by a further 5.36% compared to the traditional MAPPO algorithm.
Funding: “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (MOE) (2021RIS-002).
Abstract: This paper examines the difficulties of managing distributed power systems, notably due to the increasing use of renewable energy sources, and focuses on voltage control challenges exacerbated by their variable nature in modern power grids. To tackle the unique challenges of voltage control in distributed renewable energy networks, researchers are increasingly turning towards multi-agent reinforcement learning (MARL). However, MARL raises safety concerns because agent actions are unpredictable during the exploration phase, which can lead to unsafe control measures. To mitigate these safety concerns in MARL-based voltage control, our study introduces a novel approach: Safety-Constrained Multi-Agent Reinforcement Learning (SC-MARL). This approach incorporates a specialized safety constraint module specifically designed for voltage control within the MARL framework. This module ensures that the MARL agents carry out voltage control actions safely. The experiments demonstrate that, in the 33-bus, 141-bus, and 322-bus power systems, employing SC-MARL for voltage control reduced the Voltage Out of Control Rate (%V.out) from 0.43, 0.24, and 2.95 to 0, 0.01, and 0.03, respectively. Additionally, the Reactive Power Loss (Q loss) decreased from 0.095, 0.547, and 0.017 to 0.062, 0.452, and 0.016 in the corresponding systems.
Abstract: Medical procedures are inherently invasive and carry the risk of inducing pain to the mind and body. Recently, efforts have been made to alleviate the discomfort associated with invasive medical procedures through the use of virtual reality (VR) technology. VR has been demonstrated to be an effective treatment for pain associated with medical procedures, as well as for chronic pain conditions for which no effective treatment has been established. The precise mechanism by which the diversion from reality facilitated by VR contributes to the diminution of pain and anxiety has yet to be elucidated. However, the provision of positive images through VR-based visual stimulation may enhance the functionality of brain networks. The salience network is diminished, while the default mode network is enhanced. Additionally, the medial prefrontal cortex may establish a stronger connection with the default mode network, which could result in a reduction of pain and anxiety. Further research into the potential of VR technology to alleviate pain could lead to a reduction in the number of individuals who overdose on painkillers and contribute to positive change in the medical field.
Funding: National Natural Science Foundation of China (11971211, 12171388).
Abstract: Complex network models are frequently employed for simulating and studying diverse real-world complex systems. Among these models, scale-free networks typically exhibit greater fragility to malicious attacks. Consequently, enhancing the robustness of scale-free networks has become a pressing issue. To address this problem, this paper proposes a Multi-Granularity Integration Algorithm (MGIA), which aims to improve the robustness of scale-free networks while keeping the initial degree of each node unchanged, ensuring network connectivity, and avoiding the generation of multiple edges. The algorithm generates a multi-granularity structure from the initial network to be optimized, then uses different optimization strategies to optimize the networks at various granular layers in this structure, and finally realizes the information exchange between different granular layers, thereby further enhancing the optimization effect. We propose new network refresh, crossover, and mutation operators to ensure that the optimized network satisfies the given constraints. Meanwhile, we propose new network similarity and network dissimilarity evaluation metrics to improve the effectiveness of the optimization operators in the algorithm. In the experiments, the MGIA enhances the robustness of the scale-free network by 67.6%. This improvement is approximately 17.2% higher than the optimization effects achieved by eight currently existing complex network robustness optimization algorithms.
Abstract: Satellite edge computing has garnered significant attention from researchers; however, processing a large volume of tasks within multi-node satellite networks still poses considerable challenges. The sharp increase in user demand for latency-sensitive tasks has inevitably led to offloading bottlenecks and insufficient computational capacity on individual satellite edge servers, making it necessary to implement effective task offloading scheduling to enhance user experience. In this paper, we propose a priority-based task scheduling strategy based on a Software-Defined Network (SDN) framework for satellite-terrestrial integrated networks, which clarifies the execution order of tasks based on their priority. Subsequently, we apply a Dueling-Double Deep Q-Network (DDQN) algorithm enhanced with prioritized experience replay to derive a computation offloading strategy, improving the experience replay mechanism within the Dueling-DDQN framework. Next, we utilize the Deep Deterministic Policy Gradient (DDPG) algorithm to determine the optimal resource allocation strategy to reduce the processing latency of sub-tasks. Simulation results demonstrate that the proposed d3-DDPG algorithm outperforms other approaches, effectively reducing task processing latency and thus improving user experience and system efficiency.