Abstract
Current research on reinforcement learning (RL)-based ramp metering has not adequately addressed the learning cost of policy training or the transferability of trained policies, which makes the resulting control strategies difficult to apply in practice. To address this issue, this paper proposes an RL method for optimizing ramp metering strategies and investigates its transferability through extensive simulation experiments. A ramp metering model was constructed, and a training method based on deep RL was proposed. A merging bottleneck on the Rongwu Expressway, part of the main external road network of Xiong'an New Area, was selected as the experimental scenario. The model was trained with a deep RL algorithm, and the performance of the control strategy during training was compared with that of a classical ramp metering method to quantify the learning cost. Different simulation models and multiple sets of model parameters were then used as test environments to analyze how differences between the training and test environments affect the control strategy. The results show that when the difference between the training and test environments is within 20%, the RL method significantly outperforms the classical ramp metering method in improving traffic efficiency; when the difference exceeds 20%, there is no clear difference between the two methods.
Authors
韩雨
陈志轩
王翊萱
李春杰
雷伟
焦彦利
刘攀
HAN Yu; CHEN Zhixuan; WANG Yixuan; LI Chunjie; LEI Wei; JIAO Yanli; LIU Pan (School of Transportation, Southeast University, Nanjing 211189, China; Hebei Provincial Communications Planning, Design and Research Institute Co., Ltd., Research and Development Center of Transport Industry of Self-Driving Technology, Shijiazhuang 050011, China)
Source
《汽车安全与节能学报》
Peking University Core Journal
2025, No. 4, pp. 587-597 (11 pages)
Journal of Automotive Safety and Energy
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 52232012, 52402384, 52131203).
Keywords
ramp metering
reinforcement learning
transferability
learning cost