Funding: L. Liang was supported in part by the Natural Science Foundation of Jiangsu Province under Grant BK20220810, and in part by the National Natural Science Foundation of China under Grant 62201145 and Grant 62231019. S. Jin was supported in part by the National Natural Science Foundation of China (NSFC) under Grants 62261160576, 62341107, and 61921004.
Abstract: In this paper, we investigate the problem of fast spectrum sharing in vehicle-to-everything communication. To improve the spectrum efficiency of the whole system, the spectrum of vehicle-to-infrastructure links is reused by vehicle-to-vehicle links. To this end, we model it as a deep reinforcement learning problem and tackle it with proximal policy optimization. A considerable number of interactions are often required to train an agent with good performance, so simulation-based training is commonly used in communication networks. Nevertheless, severe performance degradation may occur when the agent is directly deployed in the real world, even though it performs well on the simulator, due to the reality gap between the simulated and real environments. To address this issue, we make preliminary efforts by proposing an algorithm based on meta reinforcement learning. This algorithm enables the agent to rapidly adapt to a new task with knowledge extracted from similar tasks, leading to fewer interactions and less training time. Numerical results show that our method achieves near-optimal performance and exhibits rapid convergence.
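To make the adaptation idea above concrete, the following is a minimal, hypothetical Python sketch of a first-order (Reptile-style) meta-reinforcement-learning loop: a policy is adapted to individual spectrum-sharing tasks with a few policy-gradient steps, and the meta-parameters are nudged toward the adapted ones so that a new task needs only a few interactions. The task interface (task.rollout), network sizes, and hyperparameters are illustrative assumptions, not the algorithm used in the paper.

```python
# Hypothetical sketch: first-order (Reptile-style) meta-RL for fast task adaptation.
# The task/environment interface and hyperparameters are placeholders, not the paper's code.
import copy
import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    """Small policy network mapping spectrum observations to action logits."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def inner_update(policy, task, steps=5, lr=1e-2):
    """Adapt a copy of the policy to one task with a few policy-gradient steps."""
    adapted = copy.deepcopy(policy)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        obs, actions, returns = task.rollout(adapted)   # placeholder interface
        logits = adapted(obs)
        logp = torch.distributions.Categorical(logits=logits).log_prob(actions)
        loss = -(logp * returns).mean()                 # REINFORCE-style surrogate
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapted


def meta_train(policy, sample_task, meta_iters=1000, meta_lr=0.1):
    """Outer loop: move meta-parameters toward the task-adapted parameters."""
    for _ in range(meta_iters):
        task = sample_task()                            # e.g., a new V2X topology
        adapted = inner_update(policy, task)
        with torch.no_grad():
            for p, q in zip(policy.parameters(), adapted.parameters()):
                p += meta_lr * (q - p)                  # Reptile meta-update
    return policy
```

After meta-training, a previously unseen task would reuse inner_update for a handful of steps instead of training from scratch, which is the source of the reduced interaction count claimed in the abstract.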
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62131005 and U19B2014, and in part by the National Key Research and Development Program of China under Grant 254.
Abstract: Unmanned aerial vehicle (UAV)-assisted communications have been considered a solution for aerial networking in future wireless networks due to their low-cost, high-mobility, and swift features. This paper considers a UAV-assisted downlink transmission, where UAVs are deployed as aerial base stations to serve ground users. To maximize the average transmission rate among the ground users, this paper formulates a joint optimization problem of UAV trajectory design and channel selection, which is NP-hard and non-convex. To solve the problem, we propose a multi-agent deep Q-network (MADQN) scheme. Specifically, the UAVs act as agents that take actions based on their local observations in a distributed manner and share the same reward. To tackle tasks where experience is insufficient, we propose a multi-agent meta reinforcement learning algorithm that adapts quickly to new tasks. By pretraining on tasks with a similar distribution, the learning model acquires general knowledge. Simulation results indicate that the MADQN scheme achieves higher throughput than fixed allocation. Furthermore, our proposed multi-agent meta reinforcement learning algorithm learns new tasks much faster than the MADQN scheme.
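As a rough illustration of the MADQN idea described above, the sketch below gives each UAV agent its own Q-network, lets the agents pick actions from their local observations in a decentralized, epsilon-greedy way, and trains each network with a standard temporal-difference update on a shared reward. The replay-batch format, network sizes, and environment interface are assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical sketch of the MADQN idea: each UAV agent has its own Q-network,
# acts on its local observation, and all agents are trained with a shared reward.
# Network sizes, the batch format, and the environment API are placeholder assumptions.
import random
import torch
import torch.nn as nn


class QNet(nn.Module):
    """Per-agent Q-network mapping a local observation to action values."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def select_actions(q_nets, observations, epsilon=0.1):
    """Epsilon-greedy, decentralized action selection: one action per UAV."""
    actions = []
    for q_net, obs in zip(q_nets, observations):
        if random.random() < epsilon:
            actions.append(random.randrange(q_net.net[-1].out_features))
        else:
            actions.append(q_net(obs).argmax().item())
    return actions


def td_update(q_net, target_net, batch, optimizer, gamma=0.99):
    """One DQN update for a single agent; the reward is the shared team reward."""
    obs, act, rew, next_obs = batch                    # tensors from a replay buffer
    q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rew + gamma * target_net(next_obs).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The meta-learning extension mentioned in the abstract would wrap a loop like this in an outer stage that pretrains the Q-networks across tasks drawn from a similar distribution, so that a new deployment scenario needs far fewer environment interactions.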