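Staff View: Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset :: Library Catalog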

Bibliographic Details
Main Authors:	Wasala, Jakub; Wrzalski, Bartlomiej; Noculak, Kornelia; Tarasenko, Yuliia; Krupa, Oliwer; Kocon, Jan; Chodak, Grzegorz
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2505.02255

_version_	1866913827905339392
author	Wasala, Jakub; Wrzalski, Bartlomiej; Noculak, Kornelia; Tarasenko, Yuliia; Krupa, Oliwer; Kocon, Jan; Chodak, Grzegorz
author_facet	Wasala, Jakub; Wrzalski, Bartlomiej; Noculak, Kornelia; Tarasenko, Yuliia; Krupa, Oliwer; Kocon, Jan; Chodak, Grzegorz
contents	This study presents a novel approach to improving the cost-to-quality ratio of image generation with diffusion models. We hypothesize that differences between distilled (e.g., FLUX.1-schnell) and baseline (e.g., FLUX.1-dev) models are consistent and therefore learnable within a specialized domain such as portrait generation. We generate a synthetic paired dataset and train a fast image-to-image translation head. Using two sets of low- and high-quality synthetic images, we train our model to refine the output of a distilled generator (e.g., FLUX.1-schnell) to a level comparable to that of a more computationally intensive baseline model such as FLUX.1-dev. Our results show that the pipeline, which combines a distilled version of a large generative model with our enhancement layer, delivers photorealistic portraits similar to those of the baseline version with up to an 82% decrease in computational cost compared to FLUX.1-dev. This study demonstrates the potential for improving the efficiency of AI solutions involving large-scale image generation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_02255
institution	arXiv
publishDate	2025
record_format	arxiv
title	Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2505.02255
