Bibliographic Details
Main Authors: Lin, Chenxing; Gao, Xinhui; Zhang, Haipeng; Li, Xinran; Wang, Haitao; Mei, Songzhu; Wen, Chenglu; Liu, Weiquan; Shen, Siqi; Wang, Cheng
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2602.23770
Abstract:
  • Generative models have gained significant traction in offline reinforcement learning (RL) due to their ability to model complex trajectory distributions. However, existing generation-based approaches still struggle with long-horizon tasks characterized by sparse rewards. Some hierarchical generation methods have been developed to mitigate this issue by decomposing the original problem into shorter-horizon subproblems using one policy and generating detailed actions with another. While effective, these methods often overlook the multi-scale temporal structure inherent in trajectories, resulting in suboptimal performance. To overcome these limitations, we propose MAGE, a Multi-scale Autoregressive GEneration-based offline RL method. MAGE incorporates a condition-guided multi-scale autoencoder to learn hierarchical trajectory representations, along with a multi-scale transformer that autoregressively generates trajectory representations from coarse to fine temporal scales. MAGE effectively captures temporal dependencies of trajectories at multiple resolutions. Additionally, a condition-guided decoder is employed to exert precise control over short-term behaviors. Extensive experiments on five offline RL benchmarks against fifteen baseline algorithms show that MAGE successfully integrates multi-scale trajectory modeling with conditional guidance, generating coherent and controllable trajectories in long-horizon sparse-reward settings.
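
To make the phrase "from coarse to fine temporal scales" concrete, the sketch below builds several views of one trajectory at different temporal resolutions by averaging non-overlapping windows of states. This is only an illustration of what a multi-scale view of a trajectory can look like. It is not MAGE's learned autoencoder or transformer, and the function names, the window sizes (4, 2, 1), and the toy trajectory are all assumptions made for the example.

```python
from typing import List, Sequence


def average_pool(traj: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Average non-overlapping windows of `window` consecutive states.

    Hypothetical helper for illustration; the record does not specify
    how MAGE forms its coarse representations.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    pooled = []
    for start in range(0, len(traj), window):
        chunk = traj[start:start + window]
        dim = len(chunk[0])
        # Mean of each state dimension over the window.
        pooled.append([sum(s[d] for s in chunk) / len(chunk) for d in range(dim)])
    return pooled


def multiscale_views(traj: Sequence[Sequence[float]],
                     windows: Sequence[int] = (4, 2, 1)) -> List[List[List[float]]]:
    """Return views of the trajectory ordered from coarsest to finest."""
    return [average_pool(traj, w) for w in sorted(windows, reverse=True)]


# Toy trajectory: 8 time steps, 2-dimensional states (assumed data).
trajectory = [[float(t), float(t * t)] for t in range(8)]
views = multiscale_views(trajectory)

for view in views:
    print(len(view), "steps; first state:", view[0])
```

With window size 1 the finest view reproduces the original trajectory, while larger windows summarize longer stretches of time in fewer steps. A coarse-to-fine generator would produce the short, coarse sequence first and then condition each finer sequence on the coarser ones.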