Staff View: ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution :: Library Catalog

Bibliographic Details
Main Authors:	Shang, Shuyao; Shan, Zhengyang; Liu, Guangxing; Wang, LunQian; Wang, XingHua; Zhang, Zekai; Zhang, Jinglin
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2303.08714

_version_	1866909090429534208
author	Shang, Shuyao; Shan, Zhengyang; Liu, Guangxing; Wang, LunQian; Wang, XingHua; Zhang, Zekai; Zhang, Jinglin
author_facet	Shang, Shuyao; Shan, Zhengyang; Liu, Guangxing; Wang, LunQian; Wang, XingHua; Zhang, Zekai; Zhang, Jinglin
contents	Adapting the Diffusion Probabilistic Model (DPM) for direct image super-resolution is wasteful, given that a simple Convolutional Neural Network (CNN) can recover the main low-frequency content. Therefore, we present ResDiff, a novel Diffusion Probabilistic Model based on a residual structure for Single Image Super-Resolution (SISR). ResDiff combines a CNN, which restores the primary low-frequency components, with a DPM, which predicts the residual between the ground-truth image and the CNN-predicted image. In contrast to common diffusion-based methods that directly use low-resolution (LR) images to guide the noise towards the high-resolution (HR) space, ResDiff uses the CNN's initial prediction to direct the noise towards the residual space between the HR space and the CNN-predicted space, which not only accelerates the generation process but also yields superior sample quality. Additionally, a frequency-domain-based loss function is introduced for the CNN to facilitate its restoration, and a frequency-domain guided diffusion is designed for the DPM to help it predict high-frequency details. Extensive experiments on multiple benchmark datasets demonstrate that ResDiff outperforms previous diffusion-based methods, with shorter model convergence time, superior generation quality, and more diverse samples.
format	Preprint
id	arxiv_https___arxiv_org_abs_2303_08714
institution	arXiv
publishDate	2023
record_format	arxiv
title	ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2303.08714
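
The residual formulation summarized in the contents field can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: nearest-neighbour upsampling is a hypothetical stand-in for the CNN, the function names are invented for illustration, and the noising step is the standard DDPM forward process applied to the residual rather than anything confirmed by this record.

```python
import numpy as np


def downsample_mean(hr, scale):
    """Build an LR image by averaging non-overlapping scale x scale blocks."""
    h, w = hr.shape
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))


def coarse_prediction(lr, scale):
    """Hypothetical stand-in for the CNN: nearest-neighbour upsampling.

    It recovers only coarse (low-frequency) content, which is the role
    the abstract assigns to the CNN.
    """
    return np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)


def residual_target(hr, coarse):
    """Residual between the ground truth and the coarse prediction.

    This is the quantity the diffusion model is trained to generate.
    """
    return hr - coarse


def forward_noise(x0, alpha_bar_t, rng):
    """Standard DDPM forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps


rng = np.random.default_rng(0)
hr = rng.random((8, 8))                        # toy ground-truth HR image
lr = downsample_mean(hr, 2)                    # toy LR input
coarse = coarse_prediction(lr, 2)              # low-frequency estimate
residual = residual_target(hr, coarse)         # diffusion target
x_t, eps = forward_noise(residual, 0.5, rng)   # noised residual at some step t
restored = coarse + residual                   # final SR image = coarse + residual
```

In this toy setup the residual has zero mean within every 2x2 block, so it carries only the detail that the coarse prediction misses; a trained diffusion model would sample that residual instead of reading it from the ground truth.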
