| Main Authors: | Yu, Kelin; Zhang, Sheng; Soora, Harshit; Huang, Furong; Huang, Heng; Tokekar, Pratap; Gao, Ruohan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Robotics; Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2508.11049 |
| _version_ | 1866916901180932096 |
|---|---|
| author | Yu, Kelin; Zhang, Sheng; Soora, Harshit; Huang, Furong; Huang, Heng; Tokekar, Pratap; Gao, Ruohan |
| contents | Recent advances have shown that video generation models can enhance robot learning by deriving effective robot actions through inverse dynamics. However, these methods depend heavily on the quality of the generated data and struggle with fine-grained manipulation due to the lack of environment feedback. While video-based reinforcement learning improves policy robustness, it remains constrained by the uncertainty of video generation and the challenges of collecting large-scale robot datasets for training diffusion models. To address these limitations, we propose GenFlowRL, which derives shaped rewards from flow generated by a model trained on diverse cross-embodiment datasets. This enables learning generalizable and robust policies from diverse demonstrations using low-dimensional, object-centric features. Experiments on 10 manipulation tasks, both in simulation and real-world cross-embodiment evaluations, demonstrate that GenFlowRL effectively leverages manipulation features extracted from generated object-centric flow, consistently achieving superior performance across diverse and challenging scenarios. Project page: https://colinyu1.github.io/genflowrl |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2508_11049 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning |
| topic | Robotics; Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2508.11049 |