Bibliographic Details
Main Authors: Zhou, Shunkai; Yan, Zike; Xue, Fei; Wu, Dong; Deng, Yuchen; Zha, Hongbin
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2604.09480
author Zhou, Shunkai
Yan, Zike
Xue, Fei
Wu, Dong
Deng, Yuchen
Zha, Hongbin
contents We present Online3R, a new sequential reconstruction framework that adapts to new scenes through online learning, effectively resolving inconsistency issues. Specifically, we introduce a set of lightweight learnable visual prompts into a pretrained, frozen geometry foundation model to capture knowledge of new environments while preserving the foundation model's fundamental capability for geometry prediction. To address the lack of ground truth and the need for high efficiency when updating these visual prompts at test time, we introduce a local-global self-supervised learning strategy that enforces local and global consistency constraints on the predictions. The local consistency constraints are applied to intermediate results and previously fused local results, enabling the model to be trained with high-quality pseudo-ground-truth signals; the global consistency constraints are applied to sparse keyframes spanning long distances rather than to every frame, allowing the model to learn efficiently from consistent predictions over a long trajectory. Our experiments demonstrate that Online3R outperforms previous state-of-the-art methods on various benchmarks. Project page: https://shunkaizhou.github.io/online3r-1.0/
format Preprint
id arxiv_https___arxiv_org_abs_2604_09480
institution arXiv
publishDate 2026
record_format arxiv
title Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2604.09480