Bibliographic Details
Main Authors: Gholami, Peyman, Xiao, Robert
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2405.00313
_version_ 1866912643014459392
author Gholami, Peyman
Xiao, Robert
author_facet Gholami, Peyman
Xiao, Robert
contents Denoising diffusion models have emerged as powerful tools for image manipulation, yet interactive, localized editing workflows remain underdeveloped. We introduce Layered Diffusion Brushes (LDB), a novel training-free framework that enables interactive, layer-based editing using standard diffusion models. LDB defines each "layer" as a self-contained set of parameters guiding the generative process, enabling independent, non-destructive, and fine-grained prompt-guided edits, even in overlapping regions. LDB leverages a unique intermediate latent caching approach to reduce each edit to only a few denoising steps, achieving 140 ms per edit on consumer GPUs. An editor implementing LDB, incorporating familiar layer concepts, was evaluated via user study and quantitative metrics. Results demonstrate LDB's superior speed alongside comparable or improved image quality, background preservation, and edit fidelity relative to state-of-the-art methods across various sequential image manipulation tasks. The findings highlight LDB's ability to significantly enhance creative workflows by providing an intuitive and efficient approach to diffusion-based image editing and its potential for expansion into related subdomains, such as video editing.
format Preprint
id arxiv_https___arxiv_org_abs_2405_00313
institution arXiv
publishDate 2024
record_format arxiv
title Streamlining Image Editing with Layered Diffusion Brushes
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2405.00313
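
The abstract's two central ideas, a "layer" as a self-contained parameter set and an intermediate latent cache that reduces each edit to a few denoising steps, can be illustrated with a toy sketch. This is not the authors' implementation: the latent is a plain list of floats, `denoise_step` is a stand-in for a real diffusion step, a numeric `target` stands in for a text prompt, and every name here (`Layer`, `generate`, `apply_layer`) is hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Layer:
    """Hypothetical per-layer parameters (assumed, not taken from the paper)."""
    mask: List[bool]   # latent positions covered by the brush
    target: float      # stand-in for a prompt: value the toy denoiser pulls toward
    strength: float    # scale of the fresh noise injected under the mask
    seed: int          # makes the layer's noise reproducible


def denoise_step(latent: List[float], targets: List[float]) -> List[float]:
    """Toy stand-in for one denoising step: move halfway toward each target."""
    return [x + 0.5 * (t - x) for x, t in zip(latent, targets)]


def generate(seed: int, size: int, total_steps: int, cache_at: int,
             base_target: float = 0.0) -> Tuple[List[float], Optional[List[float]]]:
    """Run the full toy process once and keep a copy of the latent at step cache_at."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(size)]
    targets = [base_target] * size
    cached = None
    for step in range(total_steps):
        if step == cache_at:
            cached = list(latent)  # the intermediate latent cache
        latent = denoise_step(latent, targets)
    return latent, cached


def apply_layer(cached: List[float], layer: Layer, remaining_steps: int,
                base_target: float = 0.0) -> List[float]:
    """Re-run only the last few steps from the cache, steering the masked region."""
    rng = random.Random(layer.seed)
    # Inject fresh noise under the mask only; unmasked positions are untouched.
    latent = [x + layer.strength * rng.gauss(0.0, 1.0) if m else x
              for x, m in zip(cached, layer.mask)]
    # Masked positions follow the layer's target, the rest follow the base target.
    targets = [layer.target if m else base_target for m in layer.mask]
    for _ in range(remaining_steps):
        latent = denoise_step(latent, targets)
    return latent


if __name__ == "__main__":
    final, cached = generate(seed=0, size=8, total_steps=20, cache_at=15)
    brush = Layer(mask=[i < 3 for i in range(8)], target=3.0, strength=1.0, seed=1)
    edited = apply_layer(cached, brush, remaining_steps=5)
    print("background unchanged:", edited[3:] == final[3:])
```

In this sketch an edit costs 5 steps instead of 20 because it resumes from the cached latent, and positions outside the mask end up identical to the original output, which mirrors the non-destructive, background-preserving behaviour the abstract claims. Because the layer only stores parameters, changing or discarding it means calling `apply_layer` again from the same cache.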