Abstract
Vehicle intelligence is key to transforming and upgrading the automotive industry, and trajectory tracking with collision avoidance is crucial to the driving safety of autonomous vehicles. To address the insufficient exploration of existing reinforcement learning control methods, this study proposes a diffusion-based reinforcement learning algorithm. The diffusion model is integrated into the reinforcement learning framework by replacing the conventional policy network with a diffusion-based generative policy network, which brings the multimodal distribution-matching capability of diffusion models into reinforcement learning. Combining this policy with the distributional soft actor-critic algorithm yields the diffusion distributional actor-critic (DDAC) algorithm. Simulation and real-vehicle tests show that the proposed algorithm explores efficiently: in the real-vehicle tests, the average lateral tracking error is below 0.03 m and the average speed tracking error is below 0.05 m/s, supporting the superiority of the algorithm.
Authors
Zhao Junjie (赵俊杰)
Wang Yinuo (王以诺)
Wu Jiang (吴江)
Wu Sichao (吴思潮)
Zou Changdi (邹昌迪)
Wang Hongda (王洪达)
Li Shengbo Eben (李升波)
Ma Fei (马飞)
Duan Jingliang (段京良)
Affiliations: School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083; School of Vehicle and Mobility, Tsinghua University, Beijing 100084
Source
Automotive Engineering (《汽车工程》)
Peking University Core Journal (北大核心)
2025, Issue 8, pp. 1490-1500 (11 pages)
Funding
National Natural Science Foundation of China (52202487, 62273256)
Fundamental Research Funds for the Central Universities (FRF-OT-23-02)
Keywords
trajectory tracking
active collision avoidance
distributional reinforcement learning
diffusion model