Staff View: Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yuhan; Ma, Guoqing; Hao, Guangfu; Guo, Liangxuan; Chen, Yang; Yu, Shan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.05555

_version_	1866912224550846464
author	Zhang, Yuhan; Ma, Guoqing; Hao, Guangfu; Guo, Liangxuan; Chen, Yang; Yu, Shan
author_facet	Zhang, Yuhan; Ma, Guoqing; Hao, Guangfu; Guo, Liangxuan; Chen, Yang; Yu, Shan
contents	While Reinforcement Learning (RL) agents can successfully learn to handle complex tasks, effectively generalizing acquired skills to unfamiliar settings remains a challenge. One reason is that the visual encoders used are task-dependent, which prevents effective feature extraction in different settings. To address this issue, recent studies have tried to pretrain encoders with diverse visual inputs in order to improve their performance. However, they rely on existing pretrained encoders without further exploring the impact of the pretraining period. In this work, we propose APE: efficient reinforcement learning through Adaptively Pretrained visual Encoder, a framework that uses an adaptive augmentation strategy during the pretraining phase and extracts generalizable features with only a few interactions within the task environments during the policy learning period. Experiments are conducted across various domains, including the DeepMind Control Suite, Atari Games and Memory Maze benchmarks, to verify the effectiveness of our method. Results show that mainstream RL methods, such as DreamerV3 and DrQ-v2, achieve state-of-the-art performance when equipped with APE. In addition, APE significantly improves sampling efficiency using only visual inputs during learning, approaching the efficiency of state-based methods in several control tasks. These findings demonstrate the potential of adaptively pretraining the encoder to enhance the generalization ability and efficiency of visual RL algorithms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_05555
institution	arXiv
publishDate	2025
record_format	arxiv
title	Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2502.05555

Similar Items