Bibliographic Details
Main Authors: Bui, Hieu; Gao, Ziyan; Hosoda, Yuya; Lee, Joo-Ho
Format: Preprint
Published: 2026
Subjects: Robotics; Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.19193
author Bui, Hieu
Gao, Ziyan
Hosoda, Yuya
Lee, Joo-Ho
contents As one of the simplest non-prehensile manipulation skills, pushing has been widely studied as an effective means to rearrange objects. Existing approaches, however, typically rely on multi-step push plans composed of pre-defined pushing primitives with limited application scopes, which restrict their efficiency and versatility across different scenarios. In this work, we propose a unified pushing policy that incorporates a lightweight prompting mechanism into a flow matching policy to guide the generation of reactive, multimodal pushing actions. The visual prompt can be specified by a high-level planner, enabling the reuse of the pushing policy across a wide range of planning problems. Experimental results demonstrate that the proposed unified pushing policy not only outperforms existing baselines but also effectively serves as a low-level primitive within a VLM-guided planning framework to solve table-cleaning tasks efficiently.
format Preprint
id arxiv_https___arxiv_org_abs_2602_19193
institution arXiv
publishDate 2026
record_format arxiv
title Visual Prompt Guided Unified Pushing Policy
topic Robotics
Artificial Intelligence
url https://arxiv.org/abs/2602.19193