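Abstract: AUV tracking controllers based on deep reinforcement learning must be trained from scratch for each new task, train slowly, and suffer from poor stability. To address these problems, this paper designs a multi-task fast adaptive control algorithm for AUVs based on meta-reinforcement learning, the R-SAC (Reptile-Soft Actor Critic) algorithm. R-SAC combines meta-learning with reinforcement learning and models the tracking task using the kinematic and dynamic equations of the underwater vehicle. During the training phase, R-SAC obtains a set of optimal initial model parameters for the AUV tracking controller, so that when the model faces different tasks, training from these parameters converges quickly and the controller adapts rapidly to each task. Simulation results show that, compared with a randomly initialized reinforcement learning controller, the proposed method improves convergence speed by at least 1.6 times and keeps the tracking error within 2.8%.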
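The abstract does not give the R-SAC update rule, so the following is only a minimal sketch of the generic Reptile outer loop it builds on. It makes loud assumptions: the SAC inner training is replaced by a few gradient steps on a toy quadratic loss, and the names `inner_train`, `reptile`, `meta_lr`, and the example task targets are all hypothetical, not taken from the paper.

```python
import numpy as np


def inner_train(theta, target, steps=5, lr=0.1):
    """Stand-in for task-specific training (SAC in the paper).

    ASSUMPTION: the task loss is 0.5 * ||phi - target||^2, so the
    gradient is simply (phi - target). This is a toy, not an AUV model.
    """
    phi = theta.copy()
    for _ in range(steps):
        phi -= lr * (phi - target)  # one gradient-descent step
    return phi


def reptile(task_targets, meta_steps=200, meta_lr=0.5, seed=0):
    """Reptile outer loop: move the shared initialization toward the
    parameters reached after adapting to a randomly sampled task."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(task_targets[0])
    for _ in range(meta_steps):
        target = task_targets[rng.integers(len(task_targets))]
        phi = inner_train(theta, target)   # adapt to the sampled task
        theta += meta_lr * (phi - theta)   # Reptile meta-update
    return theta


# Three hypothetical "tasks", each defined only by its optimum.
tasks = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
theta_meta = reptile(tasks)
```

In this toy setting each meta-update is a convex combination of the current initialization and a task optimum, so `theta_meta` ends up at a point between the task optima, from which any single task can be reached in a few inner steps. That is the intuition behind starting from a learned initialization instead of a random one.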
Abstract: An autonomous undersea vehicle (AUV) is a typical complex engineering system. This paper studies the disciplines and coupled variables in AUV design using multidisciplinary design optimization (MDO) methods. The framework of AUV synthetic conceptual design is described first, and then a collaborative optimization model is studied. Finally, an example is given to verify the validity and efficiency of MDO in AUV synthetic conceptual design.