Bibliographic Details
Main Authors: Ohanyan, Marianna; Manukyan, Hayk; Wang, Zhangyang; Navasardyan, Shant; Shi, Humphrey
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2406.04032
_version_ 1866907928602083328
author Ohanyan, Marianna
Manukyan, Hayk
Wang, Zhangyang
Navasardyan, Shant
Shi, Humphrey
contents We present Zero-Painter, a novel training-free framework for layout-conditional text-to-image synthesis that facilitates the creation of detailed and controlled imagery from textual prompts. Our method utilizes object masks and individual descriptions, coupled with a global text prompt, to generate images with high fidelity. Zero-Painter employs a two-stage process involving our novel Prompt-Adjusted Cross-Attention (PACA) and Region-Grouped Cross-Attention (ReGCA) blocks, ensuring precise alignment of generated objects with textual prompts and mask shapes. Our extensive experiments demonstrate that Zero-Painter surpasses current state-of-the-art methods in preserving textual details and adhering to mask shapes.
format Preprint
id arxiv_https___arxiv_org_abs_2406_04032
institution arXiv
publishDate 2024
record_format arxiv
title Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2406.04032