Affichage MARC: :: Library Catalog

Enregistré dans:

Détails bibliographiques
Auteurs principaux:	Tan, Jiangtong, Zhao, Feng
Format:	Preprint
Publié:	2024
Sujets:	Computer Vision and Pattern Recognition
Accès en ligne:	https://arxiv.org/abs/2406.19030
Tags:	Ajouter un tag Pas de tags, Soyez le premier à ajouter un tag!

_version_	1866914879256920064
author	Tan, Jiangtong Zhao, Feng
author_facet	Tan, Jiangtong Zhao, Feng
contents	Image restoration aims to enhance low quality images, producing high quality images that exhibit natural visual characteristics and fine semantic attributes. Recently, the diffusion model has emerged as a powerful technique for image generation, and it has been explicitly employed as a backbone in image restoration tasks, yielding excellent results. However, it suffers from the drawbacks of slow inference speed and large model parameters due to its intrinsic characteristics. In this paper, we introduce a new perspective that implicitly leverages the diffusion model to assist the training of image restoration network, called DiffLoss, which drives the restoration results to be optimized for naturalness and semantic-aware visual effect. To achieve this, we utilize the mode coverage capability of the diffusion model to approximate the distribution of natural images and explore its ability to capture image semantic attributes. On the one hand, we extract intermediate noise to leverage its modeling capability of the distribution of natural images, which serves as a naturalness-oriented optimization space. On the other hand, we utilize the bottleneck features of diffusion model to harness its semantic attributes serving as a constraint on semantic level. By combining these two designs, the overall loss function is able to improve the perceptual quality of image restoration, resulting in visually pleasing and semantically enhanced outcomes. To validate the effectiveness of our method, we conduct experiments on various common image restoration tasks and benchmarks. Extensive experimental results demonstrate that our approach enhances the visual quality and semantic perception of the restoration network.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_19030
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	DiffLoss: unleashing diffusion model as constraint for training image restoration network Tan, Jiangtong Zhao, Feng Computer Vision and Pattern Recognition Image restoration aims to enhance low quality images, producing high quality images that exhibit natural visual characteristics and fine semantic attributes. Recently, the diffusion model has emerged as a powerful technique for image generation, and it has been explicitly employed as a backbone in image restoration tasks, yielding excellent results. However, it suffers from the drawbacks of slow inference speed and large model parameters due to its intrinsic characteristics. In this paper, we introduce a new perspective that implicitly leverages the diffusion model to assist the training of image restoration network, called DiffLoss, which drives the restoration results to be optimized for naturalness and semantic-aware visual effect. To achieve this, we utilize the mode coverage capability of the diffusion model to approximate the distribution of natural images and explore its ability to capture image semantic attributes. On the one hand, we extract intermediate noise to leverage its modeling capability of the distribution of natural images, which serves as a naturalness-oriented optimization space. On the other hand, we utilize the bottleneck features of diffusion model to harness its semantic attributes serving as a constraint on semantic level. By combining these two designs, the overall loss function is able to improve the perceptual quality of image restoration, resulting in visually pleasing and semantically enhanced outcomes. To validate the effectiveness of our method, we conduct experiments on various common image restoration tasks and benchmarks. Extensive experimental results demonstrate that our approach enhances the visual quality and semantic perception of the restoration network.
title	DiffLoss: unleashing diffusion model as constraint for training image restoration network
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.19030

Documents similaires