Bibliographic Details
Main Authors:	Cao, Elton; Lipson, Hod
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.13549

Abstract:

Converting 2D freehand sketches into 3D models, which would bridge the gap between fluent sketching and CAD, remains a pivotal challenge in computer vision. Traditional monocular depth reconstruction techniques are not suited to interpreting line drawings. We propose a generative approach that frames reconstruction as a conditional dense depth estimation task. To achieve this, we implemented a Latent Diffusion Model (LDM) with a conditioning framework to resolve the inherent ambiguities of orthographic projections. We trained the model on a dataset of over one million image-depth pairs. The framework demonstrated robust performance across varying shape complexities, with an average depth error of 5.3 percent.
