As human exploration of the ocean expands, the demand for continuous, high-quality, and ubiquitous maritime communication is steadily increasing. However, the dynamic marine environment and tight resource constraints pose significant challenges for traditional heuristic resource allocation methods, which struggle to balance communication quality against limited network resources. The result is suboptimal system throughput and an over-reliance on specific problem structures. To address these issues, we introduce a joint resource allocation method based on knowledge embedding. The proposed approach includes an action distribution alignment module that improves resource utilization by preventing unreasonable action-output combinations. Furthermore, by integrating knowledge embedding with meta-reinforcement learning, we formulate a physical guidance loss function that reduces the number of samples required for model training and thereby enhances the algorithm's generalization capability. Simulation results show that, across various channel environments, the proposed method increases average system throughput by 31.19% compared to the model-agnostic meta-learning proximal policy optimization (MAML-PPO) algorithm and by 80.91% compared to the RL² algorithm.