Staff View: SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation :: Library Catalog

Bibliographic Details
Main Authors:	Juravsky, Jordan; Guo, Yunrong; Fidler, Sanja; Peng, Xue Bin
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language; Graphics
Online Access:	https://arxiv.org/abs/2407.10481

_version_	1866916323487907840
author	Juravsky, Jordan; Guo, Yunrong; Fidler, Sanja; Peng, Xue Bin
author_facet	Juravsky, Jordan; Guo, Yunrong; Fidler, Sanja; Peng, Xue Bin
contents	Physically-simulated models for human motion can generate high-quality responsive character animations, often in real-time. Natural language serves as a flexible interface for controlling these models, allowing expert and non-expert users to quickly create and edit their animations. Many recent physics-based animation methods, including those that use text interfaces, train control policies using reinforcement learning (RL). However, scaling these methods beyond several hundred motions has remained challenging. Meanwhile, kinematic animation models are able to successfully learn from thousands of diverse motions by leveraging supervised learning methods. Inspired by these successes, in this work we introduce SuperPADL, a scalable framework for physics-based text-to-motion that leverages both RL and supervised learning to train controllers on thousands of diverse motion clips. SuperPADL is trained in stages using progressive distillation, starting with a large number of specialized experts using RL. These experts are then iteratively distilled into larger, more robust policies using a combination of reinforcement learning and supervised learning. Our final SuperPADL controller is trained on a dataset containing over 5000 skills and runs in real time on a consumer GPU. Moreover, our policy can naturally transition between skills, allowing for users to interactively craft multi-stage animations. We experimentally demonstrate that SuperPADL significantly outperforms RL-based baselines at this large data scale.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_10481
institution	arXiv
publishDate	2024
record_format	arxiv
title	SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation
topic	Machine Learning; Artificial Intelligence; Computation and Language; Graphics
url	https://arxiv.org/abs/2407.10481

Similar Items