
基于样本独特性的强化学习经验回放机制

Reinforcement Learning Experience Replay Mechanism Based on Sample Distinctiveness
Abstract: In the field of deep reinforcement learning, particularly for high-dimensional continuous tasks, efficiently utilizing limited training data, preventing overfitting, and enhancing the model's generalization ability are crucial research challenges. Traditional reinforcement learning algorithms typically rely on a single experience replay buffer, which often suffers from low exploration efficiency and insufficient sample utilization when applied to high-dimensional continuous state and action spaces. A reinforcement learning experience replay mechanism based on sample distinctiveness, called distinctive experience replay (DER), is proposed. The core idea of DER is to identify samples with notable distinctiveness during training and store them in a dedicated distinctive-sample experience pool for replay. This mechanism not only makes effective use of diverse samples to prevent neural network overfitting, but also improves the agent's learning efficiency and decision-making quality in complex environments. Experimental results show that DER significantly improves the agent's learning efficiency and final performance in classic reinforcement learning environments.
Authors: ZHOU Zi-Yun (周梓芸), KONG Yan (孔燕); School of Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
Source: Computer Systems & Applications (《计算机系统应用》), 2025, No. 8, pp. 228-236 (9 pages)
Keywords: deep reinforcement learning; experience replay; sample efficiency; dual experience replay mechanism
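
The abstract describes a dual-pool design: all transitions go into a standard replay buffer, while transitions judged distinctive are additionally kept in a dedicated pool and mixed into training batches. The abstract does not specify how distinctiveness is scored or how the two pools are combined, so the Python sketch below is only illustrative: the distance-from-running-mean score, the threshold, and the fixed mixing ratio (distinct_ratio) are all placeholder assumptions, not the paper's method.

```python
import random
from collections import deque

import numpy as np


class DualReplayBuffer:
    """Illustrative dual-pool replay buffer: a main pool for all transitions
    plus a dedicated pool for 'distinctive' ones. The distinctiveness score
    used here (distance of the state from a running mean of seen states) is
    a placeholder assumption; the abstract does not give the criterion."""

    def __init__(self, capacity=100_000, distinct_capacity=10_000, threshold=1.0):
        self.main_pool = deque(maxlen=capacity)
        self.distinct_pool = deque(maxlen=distinct_capacity)
        self.threshold = threshold
        self._state_mean = None  # running mean of observed states
        self._count = 0

    def _distinctiveness(self, state):
        # Placeholder score: Euclidean distance from the running mean state.
        if self._state_mean is None:
            return float("inf")
        s = np.asarray(state, dtype=np.float64)
        return float(np.linalg.norm(s - self._state_mean))

    def add(self, state, action, reward, next_state, done):
        transition = (state, action, reward, next_state, done)
        # Transitions scoring above the threshold are also kept in the
        # dedicated distinctive-sample pool.
        if self._distinctiveness(state) > self.threshold:
            self.distinct_pool.append(transition)
        self.main_pool.append(transition)
        # Incrementally update the running mean of observed states.
        s = np.asarray(state, dtype=np.float64)
        self._count += 1
        if self._state_mean is None:
            self._state_mean = s
        else:
            self._state_mean += (s - self._state_mean) / self._count

    def sample(self, batch_size, distinct_ratio=0.25):
        # Mix ordinary and distinctive transitions in each training batch;
        # the 25% mixing ratio is an arbitrary illustrative choice.
        n_distinct = min(int(batch_size * distinct_ratio), len(self.distinct_pool))
        n_main = min(batch_size - n_distinct, len(self.main_pool))
        batch = random.sample(list(self.main_pool), n_main)
        batch += random.sample(list(self.distinct_pool), n_distinct)
        random.shuffle(batch)
        return batch
```

In this reading, an agent would call add() on every environment step and sample() for each gradient update; the dual-pool routing is what the keyword "dual experience replay mechanism" appears to refer to.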