Saved in:
Bibliographic Details
Main Authors: Cao, Tianshi, Rakotosaona, Marie-Julie, Poole, Ben, Tombari, Federico, Niemeyer, Michael
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2507.00916
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866908429512081408
author Cao, Tianshi
Rakotosaona, Marie-Julie
Poole, Ben
Tombari, Federico
Niemeyer, Michael
author_facet Cao, Tianshi
Rakotosaona, Marie-Julie
Poole, Ben
Tombari, Federico
Niemeyer, Michael
contents We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By decoupling the lifting problem (converting an image to a 3D model representing what is visible) from the completion problem (hallucinating content not present in the input), we create a more deterministic task suitable for discriminative models. Our method employs visibility masks derived from optimized 3D Gaussian splats to exclude areas not visible from the source view during training. This masked training strategy significantly improves reconstruction quality in visible regions compared to strong baselines. Notably, despite being trained only on masked regions, Image2GS remains competitive with state-of-the-art discriminative models trained on full target images when evaluated on complete scenes. Our findings highlight the fundamental struggle discriminative models face when fitting unseen regions and demonstrate the advantages of addressing image-to-3D lifting as a distinct problem with specialized techniques.
format Preprint
id arxiv_https___arxiv_org_abs_2507_00916
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Masks make discriminative models great again!
Cao, Tianshi
Rakotosaona, Marie-Julie
Poole, Ben
Tombari, Federico
Niemeyer, Michael
Computer Vision and Pattern Recognition
We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By decoupling the lifting problem (converting an image to a 3D model representing what is visible) from the completion problem (hallucinating content not present in the input), we create a more deterministic task suitable for discriminative models. Our method employs visibility masks derived from optimized 3D Gaussian splats to exclude areas not visible from the source view during training. This masked training strategy significantly improves reconstruction quality in visible regions compared to strong baselines. Notably, despite being trained only on masked regions, Image2GS remains competitive with state-of-the-art discriminative models trained on full target images when evaluated on complete scenes. Our findings highlight the fundamental struggle discriminative models face when fitting unseen regions and demonstrate the advantages of addressing image-to-3D lifting as a distinct problem with specialized techniques.
title Masks make discriminative models great again!
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2507.00916