Bibliographic Details
Main Authors: Cheng, Zhixin; Chen, Yujia; Tao, Xujing; Liao, Bohao; Yin, Xiaotian; Yin, Baoqun; Zhang, Tianzhu
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2605.07607
author Cheng, Zhixin
Chen, Yujia
Tao, Xujing
Liao, Bohao
Yin, Xiaotian
Yin, Baoqun
Zhang, Tianzhu
contents Image-to-point cloud registration is often challenged by viewpoint changes, cross-modal discrepancies, and repetitive textures, which induce scale ambiguity and consequently lead to erroneous correspondences. Recent detection-free methods alleviate this issue by leveraging multi-scale features and transformer-based interactions. However, they still suffer from attention drift across layers and intra-scale inconsistencies, which hinder precise registration. Inspired by human behavior, we propose a "Focus-Sweep" paradigm and develop a Hierarchical Focus-Sweep Interaction Module within an SSM-based framework to enhance multi-level cross-modal feature association. In addition, we introduce a Dynamic Layer Allocation Strategy that adaptively determines the iteration depth to better exploit geometric constraints and improve matching robustness. Extensive experiments and ablations on two benchmarks, RGB-D Scenes V2 and 7-Scenes, demonstrate that our approach achieves state-of-the-art performance.
format Preprint
id arxiv_https___arxiv_org_abs_2605_07607
institution arXiv
publishDate 2026
record_format arxiv
title FS-I2P: A Hierarchical Focus-Sweep Registration Network with Dynamically Allocated Depth
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2605.07607