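Staff View: HoRD: Robust Humanoid Control via History-Conditioned Reinforcement Learning and Online Distillation :: Library Catalog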

Bibliographic Details
Main Authors:	Wang, Puyue; Hu, Jiawei; Gao, Yan; Wang, Junyan; Zhang, Yu; Dobbie, Gillian; Gu, Tao; Johal, Wafa; Dang, Ting; Jia, Hong
Format:	Preprint
Published:	2026
Subjects:	Robotics; Machine Learning
Online Access:	https://arxiv.org/abs/2602.04412

_version_	1866908887089676288
author	Wang, Puyue; Hu, Jiawei; Gao, Yan; Wang, Junyan; Zhang, Yu; Dobbie, Gillian; Gu, Tao; Johal, Wafa; Dang, Ting; Jia, Hong
author_facet	Wang, Puyue; Hu, Jiawei; Gao, Yan; Wang, Junyan; Zhang, Yu; Dobbie, Gillian; Gu, Tao; Johal, Wafa; Dang, Ting; Jia, Hong
contents	Humanoid robots can suffer significant performance drops under small changes in dynamics, task specifications, or environment setup. We propose HoRD, a two-stage learning framework for robust humanoid control under domain shift. First, we train a high-performance teacher policy via history-conditioned reinforcement learning, where the policy infers latent dynamics context from recent state-action trajectories to adapt online to diverse randomized dynamics. Second, we perform online distillation to transfer the teacher's robust control capabilities into a transformer-based student policy that operates on sparse root-relative 3D joint keypoint trajectories. By combining history-conditioned adaptation with online distillation, HoRD enables a single policy to adapt zero-shot to unseen domains without per-domain retraining. Extensive experiments show HoRD outperforms strong baselines in robustness and transfer, especially under unseen domains and external perturbations. Code and project page are available at https://tonywang-0517.github.io/hord/.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_04412
institution	arXiv
publishDate	2026
record_format	arxiv
title	HoRD: Robust Humanoid Control via History-Conditioned Reinforcement Learning and Online Distillation
topic	Robotics; Machine Learning
url	https://arxiv.org/abs/2602.04412
