| Main Authors: | Bui, Hieu; Gao, Ziyan; Hosoda, Yuya; Lee, Joo-Ho |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Robotics; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2602.19193 |
| _version_ | 1866910029275201536 |
|---|---|
| author | Bui, Hieu; Gao, Ziyan; Hosoda, Yuya; Lee, Joo-Ho |
| author_facet | Bui, Hieu; Gao, Ziyan; Hosoda, Yuya; Lee, Joo-Ho |
| contents | As one of the simplest non-prehensile manipulation skills, pushing has been widely studied as an effective means to rearrange objects. Existing approaches, however, typically rely on multi-step push plans composed of pre-defined pushing primitives with limited application scopes, which restrict their efficiency and versatility across different scenarios. In this work, we propose a unified pushing policy that incorporates a lightweight prompting mechanism into a flow matching policy to guide the generation of reactive, multimodal pushing actions. The visual prompt can be specified by a high-level planner, enabling the reuse of the pushing policy across a wide range of planning problems. Experimental results demonstrate that the proposed unified pushing policy not only outperforms existing baselines but also effectively serves as a low-level primitive within a VLM-guided planning framework to solve table-cleaning tasks efficiently. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_19193 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Visual Prompt Guided Unified Pushing Policy |
| topic | Robotics; Artificial Intelligence |
| url | https://arxiv.org/abs/2602.19193 |