Staff View: Text To 3D Object Generation For Scalable Room Assembly :: Library Catalog

Bibliographic Details
Main Authors:	Laguna, Sonia; Garcia-Garcia, Alberto; Rakotosaona, Marie-Julie; Moschoglou, Stylianos; Helminger, Leonhard; Orts-Escolano, Sergio
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2504.09328

_version_	1866909578509156352
author	Laguna, Sonia; Garcia-Garcia, Alberto; Rakotosaona, Marie-Julie; Moschoglou, Stylianos; Helminger, Leonhard; Orts-Escolano, Sergio
author_facet	Laguna, Sonia; Garcia-Garcia, Alberto; Rakotosaona, Marie-Julie; Moschoglou, Stylianos; Helminger, Leonhard; Orts-Escolano, Sergio
contents	Modern machine learning models for scene understanding, such as depth estimation and object tracking, rely on large, high-quality datasets that mimic real-world deployment scenarios. To address data scarcity, we propose an end-to-end system for synthetic data generation for scalable, high-quality, and customizable 3D indoor scenes. By integrating and adapting text-to-image and multi-view diffusion models with Neural Radiance Field-based meshing, this system generates high-fidelity 3D object assets from text prompts and incorporates them into pre-defined floor plans using a rendering tool. By introducing novel loss functions and training strategies into existing methods, the system supports on-demand scene generation, aiming to alleviate the scarcity of currently available data, which is generally manually crafted by artists. This system advances the role of synthetic data in addressing machine learning training limitations, enabling more robust and generalizable models for real-world applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_09328
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Text To 3D Object Generation For Scalable Room Assembly Laguna, Sonia Garcia-Garcia, Alberto Rakotosaona, Marie-Julie Moschoglou, Stylianos Helminger, Leonhard Orts-Escolano, Sergio Computer Vision and Pattern Recognition Machine Learning Modern machine learning models for scene understanding, such as depth estimation and object tracking, rely on large, high-quality datasets that mimic real-world deployment scenarios. To address data scarcity, we propose an end-to-end system for synthetic data generation for scalable, high-quality, and customizable 3D indoor scenes. By integrating and adapting text-to-image and multi-view diffusion models with Neural Radiance Field-based meshing, this system generates high-fidelity 3D object assets from text prompts and incorporates them into pre-defined floor plans using a rendering tool. By introducing novel loss functions and training strategies into existing methods, the system supports on-demand scene generation, aiming to alleviate the scarcity of currently available data, which is generally manually crafted by artists. This system advances the role of synthetic data in addressing machine learning training limitations, enabling more robust and generalizable models for real-world applications.
title	Text To 3D Object Generation For Scalable Room Assembly
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2504.09328

Similar Items