Staff View: ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Shuoheng; Yuan, Yifu; Tang, Hongyao; Zheng, Yan; Yu, Qiaojun; Li, Pengyi; Huang, Guowei; Huang, Helong; Quan, Xingyue; Hao, Jianye
Format:	Preprint
Published:	2026
Subjects:	Robotics; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.11048

_version_	1866917481576136704
author	Zhang, Shuoheng; Yuan, Yifu; Tang, Hongyao; Zheng, Yan; Yu, Qiaojun; Li, Pengyi; Huang, Guowei; Huang, Helong; Quan, Xingyue; Hao, Jianye
author_facet	Zhang, Shuoheng; Yuan, Yifu; Tang, Hongyao; Zheng, Yan; Yu, Qiaojun; Li, Pengyi; Huang, Guowei; Huang, Helong; Quan, Xingyue; Hao, Jianye
contents	Existing imitation learning methods enable robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a significant challenge due to complex contact dynamics that demand high-precision force feedback and control. Although recent efforts have attempted to integrate force/torque sensing into policies, how to build a simple yet effective framework that achieves robust generalization under multimodal observations remains an open question. In this paper, we propose ForceFlow, a force-aware reactive framework built upon flow matching. For contact-stage policy design, we investigate force signal fusion mechanisms and adopt an asymmetric multimodal fusion architecture that treats force as a global regulatory signal, combined with a joint prediction paradigm that enhances the policy's understanding of instantaneous force and historical information, thereby achieving deep coupling between force and motion. For task-level hierarchical decomposition, we divide manipulation into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), with a Vision-to-Force (V2F) handover mechanism that explicitly decouples spatial generalization from contact regulation. Experimental results across six real-world contact-rich tasks demonstrate that ForceFlow achieves a 37% success rate improvement over the strong baseline ForceVLA while maintaining significantly lower cost. Moreover, ForceFlow exhibits accurate force signal prediction and demonstrates superior performance in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_11048
institution	arXiv
publishDate	2026
record_format	arxiv
title	ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching
topic	Robotics; Artificial Intelligence
url	https://arxiv.org/abs/2605.11048
