Funding: Supported by the National Natural Science Foundation of China (No. 52105466).
Abstract: In the complex and variable deep-sea environment, compensation control of ship motion ensures the safety and efficiency of equipment installation and transportation in offshore wind farms. However, the ship motion posture compensation control system is severely affected by uncertainties, which significantly degrade the accuracy of compensation control. In this paper, we propose a ship three-degree-of-freedom (3-DoF) motion posture stabilization control method based on the DTW-LSTM-MATD3 algorithm. We use the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm to control a platform with six electric cylinders and stabilize it. Because random noise affects the ship's motion posture, we use a dynamic time warping (DTW) algorithm to distinguish between high-frequency noise and low-frequency tracking signals. Further, we embed a long short-term memory (LSTM) network into the MATD3 network to better align the Critic network's training with the true Q-value. We use a combined reward function to enhance the agent's exploration capability in complex dynamic environments. Finally, the method was verified under sixth-level abrupt sea conditions with high-frequency noise and under real abrupt sea conditions, and a generalization test was also carried out. Simulation results show that the proposed DTW-LSTM-MATD3 method achieves strong compensation control performance.
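As a rough illustration of the dynamic time warping step named in the abstract above, the following is a minimal sketch of the textbook DTW distance between two one-dimensional sequences. The abstract does not say how DTW is used to separate high-frequency noise from the low-frequency tracking signal, so the function name `dtw_distance`, the absolute-difference cost, and the sample sequences are illustrative assumptions and not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical sequences: a time-stretched copy aligns at zero cost,
# while a constant offset does not.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

Because DTW tolerates local stretching along the time axis, a sequence that is only a time-warped copy of the reference scores near zero, whereas a sequence with a different shape accumulates cost.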
Abstract: With the advent of sixth-generation mobile communications (6G), space-air-ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, and classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in such highly dynamic settings. To address this issue, we formulate a multi-agent Markov decision process (MDP) for an air-ground-fused MEC system, unify link selection, bandwidth/power allocation, and task offloading into a continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward to enable adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation and baseline comparisons. The results show that the proposed method converges better and is more stable. Compared with MADDPG, TD3, and DSAC, our algorithm achieves more robust performance across key metrics.
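For context on the "twin Q-value estimates" mentioned above, here is a minimal sketch of the standard clipped double-Q target from plain TD3, on which MATD3 builds. It does not implement the paper's ALN, RO, or dynamic-temperature reward, whose details the abstract does not give. The function name `td3_target` and the default `gamma` are illustrative assumptions.

```python
def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Standard TD3 bootstrap target for one transition.

    q1_next and q2_next are the two target critics' estimates for the
    next state-action pair. Taking their minimum counters the
    overestimation bias of a single critic.
    """
    not_done = 0.0 if done else 1.0  # no bootstrapping past a terminal state
    return reward + gamma * not_done * min(q1_next, q2_next)


# Hypothetical transition values, chosen only to show the arithmetic.
y_mid_episode = td3_target(1.0, False, 2.0, 3.0, gamma=0.5)
y_terminal = td3_target(1.0, True, 2.0, 3.0, gamma=0.5)
```

Both critics regress toward this shared target, so reducing the correlation between their estimates, as the abstract describes for RO, would presumably make the minimum a more useful correction.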