Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Alharbi, Yazeed; Wonka, Peter
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.12585

_version_	1866909141993259008
author	Alharbi, Yazeed; Wonka, Peter
author_facet	Alharbi, Yazeed; Wonka, Peter
contents	We present a novel, training-free approach for textual editing of real images using diffusion models. Unlike prior methods that rely on computationally expensive finetuning, our approach leverages LAtent SPatial Alignment (LASPA) to efficiently preserve image details. We demonstrate how the diffusion process is amenable to spatial guidance using a reference image, leading to semantically coherent edits. This eliminates the need for complex optimization and costly model finetuning, resulting in significantly faster editing compared to previous methods. Additionally, our method avoids the storage requirements associated with large finetuned models. These advantages make our approach particularly well-suited for editing on mobile devices and applications demanding rapid response times. While simple and fast, our method achieves 62-71% preference in a user study and significantly better model-based editing strength and image preservation scores.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_12585
institution	arXiv
publishDate	2024
record_format	arxiv
title	LASPA: Latent Spatial Alignment for Fast Training-free Single Image Editing
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2403.12585

Similar Items