Bibliographic Details
Main Authors: Weng, Yueyang; Zhang, Xiaopeng; Mu, Yongjin; Zhu, Yingcong; Li, Yanjie; Liu, Qi
Format: Preprint
Published: 2025
Subjects: Robotics
Online Access: https://arxiv.org/abs/2511.04421
_version_ 1866915602632802304
author Weng, Yueyang
Zhang, Xiaopeng
Mu, Yongjin
Zhu, Yingcong
Li, Yanjie
Liu, Qi
contents Action chunking is a widely adopted approach in Learning from Demonstration (LfD). By modeling multi-step action chunks rather than single-step actions, it significantly improves the ability to model human expert policies. However, the reduced decision frequency limits the use of recent observations and degrades reactivity, which is particularly evident in poor adaptation to sensor noise and dynamic environmental changes. Existing efforts to address this issue have mainly traded reactivity off against decision consistency without achieving both. To address this limitation, we propose a novel algorithm, Temporal Action Selector (TAS), which caches predicted action chunks from multiple timesteps and dynamically selects the optimal action through a lightweight selector network. TAS achieves balanced optimization across three critical dimensions: reactivity, decision consistency, and motion coherence. Experiments across multiple tasks with diverse base policies show that TAS significantly improves success rates, yielding an absolute gain of up to 73.3%. Furthermore, using TAS as the base policy for residual reinforcement learning (RL) substantially improves training efficiency and raises the performance plateau. Experiments both in simulation and on physical robots confirm the method's efficacy.
format Preprint
id arxiv_https___arxiv_org_abs_2511_04421
institution arXiv
publishDate 2025
record_format arxiv
title Temporal Action Selection for Action Chunking
topic Robotics
url https://arxiv.org/abs/2511.04421
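Note: the abstract's idea of caching action chunks predicted at several timesteps and then choosing one action for the current step can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `ChunkCache` and `select_action` are hypothetical, and the scoring function stands in for the lightweight selector network, whose architecture and training this record does not describe.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

Action = Tuple[float, ...]
Chunk = List[Action]


class ChunkCache:
    """Keeps recent action chunks with the timestep each was predicted at (hypothetical helper)."""

    def __init__(self, horizon: int, max_chunks: int) -> None:
        self.horizon = horizon
        # Once max_chunks is reached, the oldest chunk is dropped automatically.
        self._chunks: Deque[Tuple[int, Chunk]] = deque(maxlen=max_chunks)

    def add(self, t0: int, chunk: Chunk) -> None:
        if len(chunk) != self.horizon:
            raise ValueError("chunk length must equal the horizon")
        self._chunks.append((t0, chunk))

    def candidates(self, t: int) -> List[Action]:
        """Return every cached prediction for timestep t, oldest chunk first."""
        return [
            chunk[t - t0]
            for t0, chunk in self._chunks
            if 0 <= t - t0 < self.horizon
        ]


def select_action(
    candidates: Sequence[Action], score: Callable[[Action], float]
) -> Action:
    """Pick the highest-scoring candidate; `score` is a stand-in for a learned selector."""
    if not candidates:
        raise ValueError("no cached chunk covers this timestep")
    return max(candidates, key=score)


# Usage: two chunks of horizon 3, predicted at t=0 and t=1, both cover t=2.
cache = ChunkCache(horizon=3, max_chunks=2)
cache.add(0, [(0.0,), (0.1,), (0.2,)])
cache.add(1, [(0.5,), (0.6,), (0.7,)])

previous = (0.1,)


def closeness(action: Action) -> float:
    # Assumed toy score: prefer the candidate closest to the previous action.
    return -sum((a - p) ** 2 for a, p in zip(action, previous))


chosen = select_action(cache.candidates(2), closeness)
```

The toy score above only rewards smooth motion. A learned selector would presumably also condition on the latest observation, which is what would let the choice react to new sensor input.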