MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Ding, Xin; Chen, Yun; Zhang, Sen; Zhang, Kao; Chen, Nenglun; Cao, Peibei; Wang, Yongwei; Wu, Fei
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2602.02114

_version_	1866914301548167168
author	Ding, Xin; Chen, Yun; Zhang, Sen; Zhang, Kao; Chen, Nenglun; Cao, Peibei; Wang, Yongwei; Wu, Fei
author_facet	Ding, Xin; Chen, Yun; Zhang, Sen; Zhang, Kao; Chen, Nenglun; Cao, Peibei; Wang, Yongwei; Wu, Fei
contents	Continuous Conditional Diffusion Model (CCDM) is a diffusion-based framework designed to generate high-quality images conditioned on continuous regression labels. Although CCDM has demonstrated clear advantages over prior approaches across a range of datasets, it still exhibits notable limitations and has recently been surpassed by a GAN-based method, namely CcGAN-AVAR. These limitations mainly arise from its reliance on an outdated diffusion framework and its low sampling efficiency due to long sampling trajectories. To address these issues, we propose an improved CCDM framework, termed iCCDM, which incorporates the more advanced \textit{Elucidated Diffusion Model} (EDM) framework with substantial modifications to improve both generation quality and sampling efficiency. Specifically, iCCDM introduces a novel matrix-form EDM formulation together with an adaptive vicinal training strategy. Extensive experiments on four benchmark datasets, spanning image resolutions from $64\times64$ to $256\times256$, demonstrate that iCCDM consistently outperforms existing methods, including state-of-the-art large-scale text-to-image diffusion models (e.g., Stable Diffusion 3, FLUX.1, and Qwen-Image), achieving higher generation quality while significantly reducing sampling cost.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_02114
institution	arXiv
publishDate	2026
record_format	arxiv
title	Enhancing Diffusion-Based Quantitatively Controllable Image Generation via Matrix-Form EDM and Adaptive Vicinal Training
topic	Computer Vision and Pattern Recognition; Machine Learning
url	https://arxiv.org/abs/2602.02114
