Staff View: QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control :: Library Catalog

Bibliographic Details
Main Authors:	Gao, Li; Yang, Fuzhi; Chen, Jianhui; Liu, Liu; Zheng, Yao; Cai, Yang; Li, Ziqiao
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2603.24021

_version_	1866914420539523072
author	Gao, Li; Yang, Fuzhi; Chen, Jianhui; Liu, Liu; Zheng, Yao; Cai, Yang; Li, Ziqiao
author_facet	Gao, Li; Yang, Fuzhi; Chen, Jianhui; Liu, Liu; Zheng, Yao; Cai, Yang; Li, Ziqiao
contents	Despite significant advances in quadrupedal robotics, a critical gap persists in foundational motion resources that holistically integrate diverse locomotion, emotionally expressive behaviors, and rich language semantics, all of which are essential for agile, intuitive human-robot interaction. Current quadruped motion datasets are limited to a few mocap primitives (e.g., walk, trot, sit) and lack diverse behaviors with rich language grounding. To bridge this gap, we introduce Quadruped Foundational Motion (QuadFM), the first large-scale, ultra-high-fidelity dataset designed for text-to-motion generation and general motion control. QuadFM contains 11,784 curated motion clips spanning locomotion, interactive, and emotion-expressive behaviors (e.g., dancing, stretching, peeing), each with three-layer annotation (fine-grained action labels, interaction scenarios, and natural language commands), totaling 35,352 descriptions to support language-conditioned understanding and command execution. We further propose Gen2Control RL, a unified framework that jointly trains a general motion controller and a text-to-motion generator, enabling efficient end-to-end inference on edge hardware. On a real quadruped robot with an NVIDIA Orin, our system achieves real-time motion synthesis (<500 ms latency). Simulation and real-world results show realistic, diverse motions while maintaining robust physical interaction. The dataset will be released at https://github.com/GaoLii/QuadFM.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_24021
institution	arXiv
publishDate	2026
record_format	arxiv
title	QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control
topic	Robotics
url	https://arxiv.org/abs/2603.24021

Similar Items