Abstract: Although deep reinforcement learning can solve many complex control problems, it does so at the cost of a large number of interactions with the environment, which remains a major challenge for the field. One reason for this is that an agent relying solely on the value-function loss struggles to extract effective features from high-dimensional, complex inputs; as a result, the agent gains an insufficient understanding of the state and cannot assign value to it correctly. This study therefore proposes a regularized predictive representation learning (RPRL) method that combines forward state prediction with a latent-space constraint, helping agents learn and extract state features from high-dimensional visual observations and thereby improving the sample efficiency of reinforcement learning. A forward state transition loss serves as an auxiliary loss so that the features learned by the agent capture dynamics information about environmental transitions. On top of this forward prediction, the state representation in the latent space is regularized, which further helps the agent learn a smooth and regular representation of the high-dimensional input. In the DeepMind Control (DMControl) suite, the proposed method achieves better performance than both model-based methods and model-free methods with representation learning.
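To make the described objective concrete, the following is a minimal sketch of how a forward-prediction auxiliary loss with a latent-space regularizer could be implemented in PyTorch. The module architectures, the stop-gradient target, and the choice of an L2 penalty on the latent norm as the regularizer are all illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps a high-dimensional observation to a latent state z (assumed MLP)."""
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))

    def forward(self, obs):
        return self.net(obs)

class ForwardModel(nn.Module):
    """Predicts the next latent state from the current latent and the action."""
    def __init__(self, latent_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + act_dim, 256), nn.ReLU(),
                                 nn.Linear(256, latent_dim))

    def forward(self, z, act):
        return self.net(torch.cat([z, act], dim=-1))

def rprl_auxiliary_loss(encoder, forward_model, obs, act, next_obs, reg_coef=0.01):
    """Auxiliary loss = forward-prediction error + latent-space regularizer."""
    z = encoder(obs)
    z_next_pred = forward_model(z, act)
    with torch.no_grad():                      # stop-gradient target (assumption)
        z_next_target = encoder(next_obs)
    prediction_loss = F.mse_loss(z_next_pred, z_next_target)
    latent_reg = reg_coef * z.pow(2).mean()    # constrains/smooths the latent space
    return prediction_loss + latent_reg
```

In a training loop, this auxiliary loss would typically be added to the usual value-function loss so that gradients from both objectives shape the encoder's features.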