Bibliographic Details
Main Authors: Hirschorn, Or; Sela, Omer; Huberman-Spiegelglas, Inbar; Efrat, Netalee; Alshan, Eli; Ideses, Ianir; Devernay, Frederic; Zvik, Yochai; Fritz, Lior
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2512.09923
Abstract:
  • Radiance field representations have recently been explored in the latent space of VAEs that are commonly used by diffusion models. This direction offers efficient rendering and seamless integration with diffusion-based pipelines. However, these methods face a fundamental limitation: the VAE latent space lacks multi-view consistency, leading to blurred textures and missing details during 3D reconstruction. Existing approaches attempt to address this by fine-tuning the VAE, at the cost of reconstruction quality, or by relying on pretrained diffusion models to recover fine-grained details, at the risk of hallucinations. We present Splatent, a diffusion-based enhancement framework designed to operate on top of 3D Gaussian Splatting (3DGS) in the latent space of VAEs. Our key insight departs from the conventional 3D-centric view: rather than reconstructing fine-grained details in 3D space, we recover them in 2D from input views through multi-view attention mechanisms. This approach preserves the reconstruction quality of pretrained VAEs while achieving faithful detail recovery. Evaluated across multiple benchmarks, Splatent establishes a new state of the art for VAE latent radiance field reconstruction. We further demonstrate that integrating our method with existing feed-forward frameworks consistently improves detail preservation, opening new possibilities for high-quality sparse-view 3D reconstruction. Code is available on our project page: https://orhir.github.io/Splatent/