Reinforcement Learning Experience Replay Mechanism Based on Sample Distinctiveness

    Abstract:

    In deep reinforcement learning, particularly for tasks with high-dimensional continuous state and action spaces, making efficient use of limited training data while avoiding overfitting and improving generalization is an important research problem. Traditional reinforcement learning algorithms typically rely on a single experience replay buffer, which in such settings often suffers from low exploration efficiency and insufficient sample utilization. An experience replay mechanism based on sample distinctiveness, called distinctive experience replay (DER), is proposed. The core idea of DER is to identify significantly distinctive samples during training and store them in a dedicated experience pool for replay. By making effective use of diverse samples, the mechanism helps prevent neural network overfitting and improves the agent's learning efficiency and decision-making quality in complex environments. Experimental results show that DER significantly improves the agent's learning efficiency and final performance in classic reinforcement learning environments.
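
The two-pool idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how distinctiveness is measured or how the two pools are mixed during sampling, so everything specific here is an assumption made for illustration: the nearest-neighbor Euclidean distance used as the score, and the names `distinctiveness`, `DualReplayBuffer`, `threshold`, and `distinct_ratio`. It should not be read as the authors' implementation.

```python
import math
import random
from collections import deque


def distinctiveness(state, reference_states):
    """Distance from `state` to its nearest neighbor in `reference_states`.

    Hypothetical score: the abstract does not define the actual metric.
    """
    if not reference_states:
        return math.inf  # nothing stored yet, so the sample counts as distinctive
    return min(math.dist(state, ref) for ref in reference_states)


class DualReplayBuffer:
    """Regular replay pool plus a dedicated pool for distinctive samples."""

    def __init__(self, capacity=10_000, distinct_capacity=1_000,
                 threshold=0.5, distinct_ratio=0.25, seed=None):
        self.regular = deque(maxlen=capacity)
        self.distinct = deque(maxlen=distinct_capacity)
        self.threshold = threshold            # assumed cut-off for "distinctive"
        self.distinct_ratio = distinct_ratio  # assumed share of each batch
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        transition = (tuple(state), action, reward, tuple(next_state), done)
        self.regular.append(transition)
        # Compare against states already held in the distinctive pool.
        stored_states = [t[0] for t in self.distinct]
        if distinctiveness(transition[0], stored_states) > self.threshold:
            self.distinct.append(transition)

    def sample(self, batch_size):
        # Draw part of the batch from the distinctive pool, the rest from the regular pool.
        n_distinct = min(int(batch_size * self.distinct_ratio), len(self.distinct))
        n_regular = min(batch_size - n_distinct, len(self.regular))
        batch = self.rng.sample(list(self.distinct), n_distinct)
        batch += self.rng.sample(list(self.regular), n_regular)
        return batch


# Usage with toy 2-D states.
buffer = DualReplayBuffer(threshold=0.5, seed=0)
buffer.add([0.0, 0.0], 0, 1.0, [0.1, 0.0], False)  # first sample: stored in both pools
buffer.add([0.1, 0.0], 1, 0.0, [0.2, 0.0], False)  # near a stored state: regular pool only
buffer.add([2.0, 2.0], 0, 1.0, [2.1, 2.0], False)  # far from stored states: both pools
batch = buffer.sample(4)
```

In this sketch the linear scan over the distinctive pool keeps the code short; a larger pool would likely need an index structure or a learned embedding, which is again outside what the abstract specifies.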

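Cite this article

Zhou Ziyun (周梓芸), Kong Yan (孔燕). Reinforcement learning experience replay mechanism based on sample distinctiveness. Computer Systems & Applications (计算机系统应用), , (): 1-9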

History
  • Received: 2024-12-09
  • Last revised: 2025-01-02
  • Published online: 2025-04-25