Bibliographic Details
Main Authors: Tao, Keda, Gu, Jinjin, Zhang, Yulun, Wang, Xiucheng, Cheng, Nan
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2410.04161
author Tao, Keda
Gu, Jinjin
Zhang, Yulun
Wang, Xiucheng
Cheng, Nan
contents We introduce a novel Multi-modal Guided Real-World Face Restoration (MGFR) technique designed to improve the quality of facial image restoration from low-quality inputs. Leveraging a blend of attribute text prompts, high-quality reference images, and identity information, MGFR can mitigate the generation of false facial attributes and identities often associated with generative face restoration methods. By incorporating a dual-control adapter and a two-stage training strategy, our method effectively utilizes multi-modal prior information for targeted restoration tasks. We also present the Reface-HQ dataset, comprising over 21,000 high-resolution facial images across 4,800 identities, to address the need for reference face training images. Our approach achieves superior visual quality in restoring facial details under severe degradation and allows for controlled restoration processes, enhancing the accuracy of identity preservation and attribute correction. Including negative quality samples and attribute prompts in the training further refines the model's ability to generate detailed and perceptually accurate images.
format Preprint
id arxiv_https___arxiv_org_abs_2410_04161
institution arXiv
publishDate 2024
record_format arxiv
title Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2410.04161