Samples Expanding of Deep Q Network Based on Genetic Crossover Operator
    Abstract:

    Traditional deep reinforcement learning methods train on transitions selected one by one from the experience replay. For a Deep Q Network (DQN) that instead uses entire episode trajectories as training samples, this paper proposes a method for expanding the set of episode samples based on the crossover operator of genetic algorithms. Episode trajectories are generated during the trial-and-error decision-making process in which the agent interacts with the environment, and similar key states are encountered across different trajectories. Taking a similar state shared by two episode trajectories as the crossover point, new episode trajectories that have not yet appeared can be generated. This enlarges the number of episode samples and increases their diversity, thereby enhancing the agent's exploration ability and improving sample efficiency. Compared with DQN variants that select samples randomly or use the Episodic Backward Update (EBU) algorithm, the proposed method achieves higher rewards on Atari 2600 games.
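
    The crossover idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how state similarity is measured or how episodes are stored, so the Euclidean distance test, the `threshold` parameter, the `(state, action, reward)` transition layout, and all function names below are assumptions made for illustration only.

```python
import math
import random
from typing import List, Optional, Tuple

# Hypothetical episode layout: a list of (state, action, reward) transitions.
State = Tuple[float, ...]
Transition = Tuple[State, int, float]
Episode = List[Transition]


def distance(a: State, b: State) -> float:
    """Euclidean distance between two state vectors (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_crossover_points(
    ep_a: Episode, ep_b: Episode, threshold: float
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where ep_a[i] and ep_b[j] hold similar states."""
    return [
        (i, j)
        for i, (state_a, _, _) in enumerate(ep_a)
        for j, (state_b, _, _) in enumerate(ep_b)
        if distance(state_a, state_b) <= threshold
    ]


def crossover(
    ep_a: Episode,
    ep_b: Episode,
    threshold: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Optional[Tuple[Episode, Episode]]:
    """Single-point crossover of two episodes at a shared similar state.

    Returns two child episodes, or None if the parents share no similar state.
    Trivial points (for example i == 0 and j == 0) simply reproduce a parent;
    a fuller implementation would probably filter those out.
    """
    rng = rng or random.Random()
    points = find_crossover_points(ep_a, ep_b, threshold)
    if not points:
        return None
    i, j = rng.choice(points)
    # Head of one parent up to the similar state, tail of the other from it.
    child_ab = ep_a[:i] + ep_b[j:]
    child_ba = ep_b[:j] + ep_a[i:]
    return child_ab, child_ba
```

    Each child keeps the head of one parent up to the similar state and continues with the tail of the other, so two stored episodes can yield up to two trajectories that the agent never actually executed. How the paper handles approximate matches, return recomputation, or sampling of the children is not stated in the abstract and is left out of this sketch.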

Get Citation

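Yang Tong, Qin Jin, Xie Zhongtao, Yuan Linlin. Samples Expanding of Deep Q Network Based on Genetic Crossover Operator. Computer Systems & Applications, 2021, 30(12): 155-162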

History
  • Received: February 22, 2021
  • Revised: March 19, 2021
  • Online: December 10, 2021