| Main Authors: | Zhou, Shunkai; Yan, Zike; Xue, Fei; Wu, Dong; Deng, Yuchen; Zha, Hongbin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2604.09480 |
| _version_ | 1866913021727604736 |
|---|---|
| author | Zhou, Shunkai; Yan, Zike; Xue, Fei; Wu, Dong; Deng, Yuchen; Zha, Hongbin |
| author_facet | Zhou, Shunkai; Yan, Zike; Xue, Fei; Wu, Dong; Deng, Yuchen; Zha, Hongbin |
| contents | We present Online3R, a new sequential reconstruction framework that adapts to new scenes through online learning, effectively resolving inconsistency issues. Specifically, we introduce a set of learnable lightweight visual prompts into a pretrained, frozen geometry foundation model to capture the knowledge of new environments while preserving the fundamental capability of the foundation model for geometry prediction. To address the lack of ground truth and the need for high efficiency when updating these visual prompts at test time, we introduce a local-global self-supervised learning strategy that enforces local and global consistency constraints on predictions. The local consistency constraints are applied to intermediate and previously fused local results, enabling the model to be trained with high-quality pseudo ground-truth signals; the global consistency constraints are applied to sparse keyframes spanning long distances rather than to every frame, allowing the model to learn efficiently from consistent predictions over a long trajectory. Our experiments demonstrate that Online3R outperforms previous state-of-the-art methods on various benchmarks. Project page: https://shunkaizhou.github.io/online3r-1.0/ |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_09480 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2604.09480 |
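
The abstract describes updating lightweight prompts at test time while the foundation model stays frozen, using consistency with pseudo ground truth as the training signal. The toy sketch below illustrates only that general idea and is not the Online3R implementation: the linear "model", the additive prompt, the squared-error loss, and every function name are assumptions made for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, v: Vector) -> Vector:
    """Multiply matrix W by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def frozen_model(W: Matrix, x: Vector, prompt: Vector) -> Vector:
    """Hypothetical frozen predictor: W is never updated, only the prompt is."""
    return matvec(W, [xi + pi for xi, pi in zip(x, prompt)])


def consistency_loss(pred: Vector, target: Vector) -> float:
    """Squared error between a prediction and a pseudo ground-truth target."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def prompt_gradient(W: Matrix, x: Vector, prompt: Vector, target: Vector) -> Vector:
    """Gradient of the loss w.r.t. the prompt: 2 * W^T (W(x + prompt) - target)."""
    residual = [p - t for p, t in zip(frozen_model(W, x, prompt), target)]
    return [
        2.0 * sum(W[i][j] * residual[i] for i in range(len(W)))
        for j in range(len(prompt))
    ]


def adapt_prompt(W: Matrix, x: Vector, target: Vector,
                 steps: int = 200, lr: float = 0.1) -> Vector:
    """Gradient descent on the prompt alone; the model weights stay fixed."""
    prompt = [0.0] * len(x)
    for _ in range(steps):
        grad = prompt_gradient(W, x, prompt, target)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


if __name__ == "__main__":
    W = [[1.0, 0.5], [0.0, 1.0]]   # stands in for frozen pretrained weights
    x = [1.0, 2.0]                 # stands in for an input frame
    target = [2.5, 1.5]            # stands in for a fused pseudo ground truth
    before = consistency_loss(frozen_model(W, x, [0.0, 0.0]), target)
    prompt = adapt_prompt(W, x, target)
    after = consistency_loss(frozen_model(W, x, prompt), target)
    print(f"loss before: {before:.6f}, loss after: {after:.6f}")
```

In this sketch the target plays the role of a pseudo ground-truth signal; in the setting the abstract describes, such signals would come from locally fused results and sparse keyframes rather than a fixed vector.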