Staff View: LSReGen: Large-Scale Regional Generator via Backward Guidance Framework :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Bowen; Yang, Cheng; Liu, Xuanhui
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.15066

_version_	1866916331474911232
author	Zhang, Bowen; Yang, Cheng; Liu, Xuanhui
author_facet	Zhang, Bowen; Yang, Cheng; Liu, Xuanhui
contents	In recent years, advancements in AIGC (Artificial Intelligence Generated Content) technology have significantly enhanced the capabilities of large text-to-image models. Despite these improvements, controllable image generation remains a challenge. Current methods, such as training, forward guidance, and backward guidance, have notable limitations. The first two approaches either demand substantial computational resources or produce subpar results. The third approach depends on phenomena specific to certain model architectures, complicating its application to large-scale image generation. To address these issues, we propose a novel controllable generation framework that offers a generalized interpretation of backward guidance without relying on specific assumptions. Leveraging this framework, we introduce LSReGen, a large-scale layout-to-image method designed to generate high-quality, layout-compliant images. Experimental results show that LSReGen outperforms existing methods in the large-scale layout-to-image task, underscoring the effectiveness of our proposed framework. Our code and models will be open-sourced.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_15066
institution	arXiv
publishDate	2024
record_format	arxiv
title	LSReGen: Large-Scale Regional Generator via Backward Guidance Framework
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2407.15066
