Staff View: Masks make discriminative models great again! :: Library Catalog

Bibliographic Details
Main Authors:	Cao, Tianshi; Rakotosaona, Marie-Julie; Poole, Ben; Tombari, Federico; Niemeyer, Michael
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.00916

_version_	1866908429512081408
author	Cao, Tianshi; Rakotosaona, Marie-Julie; Poole, Ben; Tombari, Federico; Niemeyer, Michael
author_facet	Cao, Tianshi; Rakotosaona, Marie-Julie; Poole, Ben; Tombari, Federico; Niemeyer, Michael
contents	We present Image2GS, a novel approach that addresses the challenging problem of reconstructing photorealistic 3D scenes from a single image by focusing specifically on the image-to-3D lifting component of the reconstruction process. By decoupling the lifting problem (converting an image to a 3D model representing what is visible) from the completion problem (hallucinating content not present in the input), we create a more deterministic task suitable for discriminative models. Our method employs visibility masks derived from optimized 3D Gaussian splats to exclude areas not visible from the source view during training. This masked training strategy significantly improves reconstruction quality in visible regions compared to strong baselines. Notably, despite being trained only on masked regions, Image2GS remains competitive with state-of-the-art discriminative models trained on full target images when evaluated on complete scenes. Our findings highlight the fundamental struggle discriminative models face when fitting unseen regions and demonstrate the advantages of addressing image-to-3D lifting as a distinct problem with specialized techniques.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_00916
institution	arXiv
publishDate	2025
record_format	arxiv
title	Masks make discriminative models great again!
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2507.00916

Similar Items