Staff View: AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation :: Library Catalog


Bibliographic Details
Main Authors:	Wang, Xu; Li, Zexian; Gong, Litong; Ge, Tiezheng; Deng, Zhijie
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.28126

_version_	1866911637035810816
author	Wang, Xu; Li, Zexian; Gong, Litong; Ge, Tiezheng; Deng, Zhijie
author_facet	Wang, Xu; Li, Zexian; Gong, Litong; Ge, Tiezheng; Deng, Zhijie
contents	Diffusion models offer superior generation quality at the expense of extensive sampling steps. Distillation methods, with Distribution Matching Distillation (DMD) as a popular example, can mitigate this issue, but performance degradation remains pronounced when sampling steps are limited. Reinforcement learning (RL) has been leveraged to improve few-step generation quality during distillation, with the potential to even surpass the performance of the teacher model. However, existing approaches are combinatorial in nature, merely integrating an RL process with the distillation process, which introduces unnecessary complexity. To address this gap, we propose AdvDMD, a method that seamlessly unifies DMD distillation and RL. Specifically, AdvDMD employs the adversarially trained discriminator from DMD2 as the reward model, which assigns low scores to generated images and high scores to real ones. It is trained on both intermediate and final states of the denoising process and updated online with the distilled model, enabling holistic supervision of the sampling trajectories and mitigating reward hacking. We adopt a unified SDE backward simulation and a different training schedule for DMD and RL to enable more stable and efficient training. Experimental results demonstrate that the 4-step AdvDMD outperforms the original 40-step model for SD3.5 on DPG-Bench, while achieving significant performance gains for SD3 on GenEval. On Qwen-Image, our 2-step AdvDMD achieves superior performance over TwinFlow.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_28126
institution	arXiv
publishDate	2026
record_format	arxiv
title	AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2604.28126
