Staff View: Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis

Bibliographic Details
Main Authors:	Ohanyan, Marianna; Manukyan, Hayk; Wang, Zhangyang; Navasardyan, Shant; Shi, Humphrey
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.04032

_version_	1866907928602083328
author	Ohanyan, Marianna; Manukyan, Hayk; Wang, Zhangyang; Navasardyan, Shant; Shi, Humphrey
author_facet	Ohanyan, Marianna; Manukyan, Hayk; Wang, Zhangyang; Navasardyan, Shant; Shi, Humphrey
contents	We present Zero-Painter, a novel training-free framework for layout-conditional text-to-image synthesis that facilitates the creation of detailed and controlled imagery from textual prompts. Our method utilizes object masks and individual descriptions, coupled with a global text prompt, to generate images with high fidelity. Zero-Painter employs a two-stage process involving our novel Prompt-Adjusted Cross-Attention (PACA) and Region-Grouped Cross-Attention (ReGCA) blocks, ensuring precise alignment of generated objects with textual prompts and mask shapes. Our extensive experiments demonstrate that Zero-Painter surpasses current state-of-the-art methods in preserving textual details and adhering to mask shapes.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_04032
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis Ohanyan, Marianna Manukyan, Hayk Wang, Zhangyang Navasardyan, Shant Shi, Humphrey Computer Vision and Pattern Recognition We present Zero-Painter, a novel training-free framework for layout-conditional text-to-image synthesis that facilitates the creation of detailed and controlled imagery from textual prompts. Our method utilizes object masks and individual descriptions, coupled with a global text prompt, to generate images with high fidelity. Zero-Painter employs a two-stage process involving our novel Prompt-Adjusted Cross-Attention (PACA) and Region-Grouped Cross-Attention (ReGCA) blocks, ensuring precise alignment of generated objects with textual prompts and mask shapes. Our extensive experiments demonstrate that Zero-Painter surpasses current state-of-the-art methods in preserving textual details and adhering to mask shapes.
title	Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.04032
