Bibliographic Details
Main Authors: Lee, Jihoon; Min, Yunhong; Kim, Hwidong; Ahn, Sangtae
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2408.04962
Table of Contents:
  • Text-guided image inpainting has received considerable research attention in recent years. The task remains challenging, however, because the image must align with the text and the corrupted and uncorrupted regions must follow a consistent distribution. In this paper, we propose a dual affine transformation generative adversarial network (DAFT-GAN) that maintains semantic consistency in text-guided inpainting. DAFT-GAN integrates two affine transformation networks that gradually combine text and image features in each decoding block. Moreover, to support fine-grained image generation, we minimize information leakage from uncorrupted features by encoding the corrupted and uncorrupted regions of the masked image separately. The proposed model outperforms existing GAN-based models in both qualitative and quantitative assessments on three benchmark datasets (MS-COCO, CUB, and Oxford) for text-guided image inpainting.
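
The abstract names two ideas that a small example can make concrete: conditioning features through affine (scale-and-shift) transformations, and handling corrupted and uncorrupted regions separately. The sketch below is only an illustration under stated assumptions, not the DAFT-GAN implementation. The function names, the order in which the two conditions are applied, the use of a single dense layer to predict each scale and shift, and the mask convention (1.0 marks a corrupted pixel) are all hypothetical choices that the abstract does not specify.

```python
from typing import List, Tuple

Vector = List[float]
# One hypothetical conditioning layer: (W_gamma, b_gamma, W_beta, b_beta).
AffineLayer = Tuple[List[Vector], Vector, List[Vector], Vector]


def linear(weights: List[Vector], bias: Vector, cond: Vector) -> Vector:
    """Dense layer: weights @ cond + bias."""
    return [
        sum(w * c for w, c in zip(row, cond)) + b
        for row, b in zip(weights, bias)
    ]


def affine_modulate(features: List[Vector], gamma: Vector, beta: Vector) -> List[Vector]:
    """Per-channel scale and shift: out[c][i] = gamma[c] * features[c][i] + beta[c]."""
    return [
        [g * value + b for value in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


def dual_affine_block(
    features: List[Vector],
    text_vec: Vector,
    image_vec: Vector,
    text_layer: AffineLayer,
    image_layer: AffineLayer,
) -> List[Vector]:
    """Apply a text-conditioned, then an image-conditioned affine transformation.

    Assumption: the two transformations are applied in sequence; the
    abstract only says that two affine networks combine text and image features.
    """
    out = features
    for cond, (w_g, b_g, w_b, b_b) in ((text_vec, text_layer), (image_vec, image_layer)):
        gamma = linear(w_g, b_g, cond)  # predicted per-channel scale
        beta = linear(w_b, b_b, cond)   # predicted per-channel shift
        out = affine_modulate(out, gamma, beta)
    return out


def split_by_mask(pixels: Vector, mask: Vector) -> Tuple[Vector, Vector]:
    """Split one flattened channel into (uncorrupted, corrupted) parts.

    Assumption: mask[i] == 1.0 marks a corrupted pixel.
    """
    known = [p * (1.0 - m) for p, m in zip(pixels, mask)]
    missing = [p * m for p, m in zip(pixels, mask)]
    return known, missing
```

In this sketch each part returned by `split_by_mask` would go to its own encoder, so that features from uncorrupted pixels are not mixed into the corrupted branch before decoding. How the paper's encoders and decoding blocks are actually wired is not stated in the abstract.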