Bibliographic Details
Main Authors: Dong, Jiacheng; Li, Huan; Zhou, Sicheng; Hu, Wenhao; Xu, Weili; Wang, Yan
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2603.15330
author Dong, Jiacheng
Li, Huan
Zhou, Sicheng
Hu, Wenhao
Xu, Weili
Wang, Yan
contents Reconstruction is a fundamental task in 3D vision and a core capability for spatial intelligence. In particular, streaming 3D reconstruction is central to real-time spatial perception, yet existing recurrent online models often degrade progressively on long sequences because of state drift and forgetting, which motivates inference-time remedies. We present MeMix, a training-free, plug-and-play module that improves streaming reconstruction by recasting the recurrent state as a Memory Mixture. MeMix partitions the state into multiple independent memory patches and updates only the least-aligned patches while exactly preserving the others. This selective update mitigates catastrophic forgetting while retaining $O(1)$ inference memory, and it requires no fine-tuning or additional learnable parameters, making it directly applicable to existing recurrent reconstruction models. Across standard benchmarks (ScanNet, 7-Scenes, KITTI, etc.), under identical backbones and inference settings, MeMix reduces reconstruction completeness error by 15.3% on average (up to 40.0%) across streams of 300 to 500 frames on 7-Scenes. The code is available at https://dongjiacheng06.github.io/MeMix/
format Preprint
id arxiv_https___arxiv_org_abs_2603_15330
institution arXiv
publishDate 2026
record_format arxiv
title MeMix: Writing Less, Remembering More for Streaming 3D Reconstruction
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2603.15330
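
The selective update described in the abstract (partition the state into memory patches, overwrite only the least-aligned ones, leave the rest exactly as they were) can be illustrated with a minimal sketch. This is not the authors' implementation. The alignment score used here (cosine similarity between each patch and its proposed replacement), the fixed budget `k`, and the names `cosine` and `selective_update` are all assumptions made for illustration; the record does not specify how MeMix measures alignment or how many patches it rewrites.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def selective_update(state, candidate, k):
    """Overwrite only the k least-aligned patches; keep all others untouched.

    state, candidate: lists of equal-length vectors, one per memory patch.
    ASSUMPTION: alignment is the cosine between a patch and its candidate,
    and k is a fixed budget. Neither is taken from the paper.
    """
    scores = [cosine(s, c) for s, c in zip(state, candidate)]
    # Indices of the k lowest-scoring (least-aligned) patches.
    chosen = set(sorted(range(len(state)), key=lambda i: scores[i])[:k])
    # Unselected patches are passed through as the same objects (exactly preserved).
    return [list(candidate[i]) if i in chosen else state[i]
            for i in range(len(state))]


# Toy state with three 2-D patches and a proposed update for each.
state = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidate = [[0.9, 0.1], [1.0, 0.0], [1.0, 1.2]]
new_state = selective_update(state, candidate, k=1)
```

In this toy example only the second patch is rewritten, because its candidate is orthogonal to its current value, while the other two patches are carried over unchanged. The state has the same number of patches before and after the update, so its size does not grow with the length of the stream, which is consistent with the constant inference memory the abstract mentions.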