Bibliographic Details
Main Authors:	Shen, Fei; Ye, Hu; Zhang, Jun; Wang, Cong; Han, Xiao; Yang, Wei
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.06313

_version_	1866916489570811904
author	Shen, Fei; Ye, Hu; Zhang, Jun; Wang, Cong; Han, Xiao; Yang, Wei
author_facet	Shen, Fei; Ye, Hu; Zhang, Jun; Wang, Cong; Han, Xiao; Yang, Wei
contents	Recent work has showcased the significant potential of diffusion models in pose-guided person image synthesis. However, owing to the inconsistency in pose between the source and target images, synthesizing an image with a distinct pose, relying exclusively on the source image and target pose information, remains a formidable challenge. This paper presents Progressive Conditional Diffusion Models (PCDMs) that incrementally bridge the gap between person images under the target and source poses through three stages. Specifically, in the first stage, we design a simple prior conditional diffusion model that predicts the global features of the target image by mining the global alignment relationship between pose coordinates and image appearance. Then, the second stage establishes a dense correspondence between the source and target images using the global features from the previous stage, and an inpainting conditional diffusion model is proposed to further align and enhance the contextual features, generating a coarse-grained person image. In the third stage, we propose a refining conditional diffusion model to utilize the coarsely generated image from the previous stage as a condition, achieving texture restoration and enhancing fine-detail consistency. The three-stage PCDMs work progressively to generate the final high-quality and high-fidelity synthesized image. Both qualitative and quantitative results demonstrate the consistency and photorealism of our proposed PCDMs under challenging scenarios. The code and model will be available at https://github.com/tencent-ailab/PCDMs.
format	Preprint
id	arxiv_https___arxiv_org_abs_2310_06313
institution	arXiv
publishDate	2023
record_format	arxiv
title	Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2310.06313
