Bibliographic Details
Main Authors: Zhang, Shuoheng; Yuan, Yifu; Tang, Hongyao; Zheng, Yan; Yu, Qiaojun; Li, Pengyi; Huang, Guowei; Huang, Helong; Quan, Xingyue; Hao, Jianye
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2605.11048
author Zhang, Shuoheng
Yuan, Yifu
Tang, Hongyao
Zheng, Yan
Yu, Qiaojun
Li, Pengyi
Huang, Guowei
Huang, Helong
Quan, Xingyue
Hao, Jianye
contents Existing imitation learning methods enable robots to interact autonomously with the physical environment. However, contact-rich manipulation tasks remain a significant challenge due to complex contact dynamics that demand high-precision force feedback and control. Although recent efforts have attempted to integrate force/torque sensing into policies, how to build a simple yet effective framework that achieves robust generalization under multimodal observations remains an open question. In this paper, we propose ForceFlow, a force-aware reactive framework built upon flow matching. For contact-stage policy design, we investigate force signal fusion mechanisms and adopt an asymmetric multimodal fusion architecture that treats force as a global regulatory signal, combined with a joint prediction paradigm that enhances the policy's understanding of instantaneous force and historical information, thereby achieving deep coupling between force and motion. For task-level hierarchical decomposition, we divide manipulation into a vision-dominant approach stage (VLM-based pointing for target localization) and a touch-dominant interaction stage (force-driven contact execution), with a Vision-to-Force (V2F) handover mechanism that explicitly decouples spatial generalization from contact regulation. Experimental results across six real-world contact-rich tasks demonstrate that ForceFlow achieves a 37% success rate improvement over the strong baseline ForceVLA while maintaining significantly lower cost. Moreover, ForceFlow exhibits accurate force signal prediction and demonstrates superior performance in contact force self-regulation and zero-shot out-of-distribution (OOD) generalization.
format Preprint
id arxiv_https___arxiv_org_abs_2605_11048
institution arXiv
publishDate 2026
record_format arxiv
title ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching
topic Robotics
Artificial Intelligence
url https://arxiv.org/abs/2605.11048