Staff View: GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning :: Library Catalog


Bibliographic Details
Main Authors:	Yu, Kelin; Zhang, Sheng; Soora, Harshit; Huang, Furong; Huang, Heng; Tokekar, Pratap; Gao, Ruohan
Format:	Preprint
Published:	2025
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2508.11049

_version_	1866916901180932096
author	Yu, Kelin; Zhang, Sheng; Soora, Harshit; Huang, Furong; Huang, Heng; Tokekar, Pratap; Gao, Ruohan
author_facet	Yu, Kelin; Zhang, Sheng; Soora, Harshit; Huang, Furong; Huang, Heng; Tokekar, Pratap; Gao, Ruohan
contents	Recent advances have shown that video generation models can enhance robot learning by deriving effective robot actions through inverse dynamics. However, these methods heavily depend on the quality of generated data and struggle with fine-grained manipulation due to the lack of environment feedback. While video-based reinforcement learning improves policy robustness, it remains constrained by the uncertainty of video generation and the challenges of collecting large-scale robot datasets for training diffusion models. To address these limitations, we propose GenFlowRL, which derives shaped rewards from generated flow trained from diverse cross-embodiment datasets. This enables learning generalizable and robust policies from diverse demonstrations using low-dimensional, object-centric features. Experiments on 10 manipulation tasks, both in simulation and real-world cross-embodiment evaluations, demonstrate that GenFlowRL effectively leverages manipulation features extracted from generated object-centric flow, consistently achieving superior performance across diverse and challenging scenarios. Our Project Page: https://colinyu1.github.io/genflowrl
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_11049
institution	arXiv
publishDate	2025
record_format	arxiv
title	GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
topic	Robotics; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2508.11049
