Bibliographic Details
Main Authors: Ma, Yubiao, Yu, Han, Xie, Jiayin, Lv, Changtai, Luo, Qiang, Zhang, Chi, Yin, Yunpeng, Xing, Boyang, Ren, Xuemei, Zheng, Dongdong
Format: Preprint
Published: 2026
Subjects: Robotics
Online Access: https://arxiv.org/abs/2601.23080
_version_ 1866908800584253440
author Ma, Yubiao
Yu, Han
Xie, Jiayin
Lv, Changtai
Luo, Qiang
Zhang, Chi
Yin, Yunpeng
Xing, Boyang
Ren, Xuemei
Zheng, Dongdong
contents Learning a general humanoid whole-body controller is challenging because practical reference motions can exhibit noise and inconsistencies after being transferred to the robot domain, and local defects may be amplified by closed-loop execution, causing drift or failure in highly dynamic and contact-rich behaviors. We propose a dynamics-conditioned command aggregation framework that uses a causal temporal encoder to summarize recent proprioception and a multi-head cross-attention command encoder to selectively aggregate a context window based on the current dynamics. We further integrate a fall recovery curriculum with random unstable initialization and an annealed upward assistance force to improve robustness and disturbance rejection. The resulting policy requires only about 3.5 hours of motion data and supports single-stage end-to-end training without distillation. The proposed method is evaluated under diverse reference inputs and challenging motion regimes, demonstrating zero-shot transfer to unseen motions as well as robust sim-to-real transfer on a physical humanoid robot.
format Preprint
id arxiv_https___arxiv_org_abs_2601_23080
institution arXiv
publishDate 2026
record_format arxiv
title Robust and Generalized Humanoid Motion Tracking
topic Robotics
url https://arxiv.org/abs/2601.23080