Tabla de Contenidos: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Lin, Haitao, Hu, Peiyan, Ren, Minsi, Gao, Zhifeng, Ma, Zhi-Ming, ke, Guolin, Wu, Tailin, Li, Stan Z.
Formato:	Preprint
Publicado:	2025
Materias:	Machine Learning Computer Vision and Pattern Recognition
Acceso en línea:	https://arxiv.org/abs/2512.11831
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

Tabla de Contenidos:

Recent advances in few-step diffusion models have demonstrated their efficiency and effectiveness by shortcutting the probabilistic paths of diffusion models, especially in training one-step diffusion models from scratch (\emph{a.k.a.} shortcut models). However, their theoretical derivation and practical implementation are often closely coupled, which obscures the design space. To address this, we propose a common design framework for representative shortcut models. This framework provides theoretical justification for their validity and disentangles concrete component-level choices, thereby enabling systematic identification of improvements. With our proposed improvements, the resulting one-step model achieves a new state-of-the-art FID50k of 2.85 on ImageNet-256x256 under the classifier-free guidance setting with one step generation, and further reaches FID50k of 2.53 with 2x training steps. Remarkably, the model requires no pre-training, distillation, or curriculum learning. We believe our work lowers the barrier to component-level innovation in shortcut models and facilitates principled exploration of their design space.

Ejemplares similares