Staff View: IDLM: Inverse-distilled Diffusion Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Li, David; Gushchin, Nikita; Abulkhanov, Dmitry; Moulines, Eric; Oseledets, Ivan; Panov, Maxim; Korotin, Alexander
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.19066

_version_	1866911735482417152
author	Li, David; Gushchin, Nikita; Abulkhanov, Dmitry; Moulines, Eric; Oseledets, Ivan; Panov, Maxim; Korotin, Alexander
author_facet	Li, David; Gushchin, Nikita; Abulkhanov, Dmitry; Moulines, Eric; Oseledets, Ivan; Panov, Maxim; Korotin, Alexander
contents	Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. To address this, we extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. Nonetheless, this extension introduces both theoretical and practical challenges. From a theoretical perspective, the inverse distillation objective lacks uniqueness guarantees, which may lead to suboptimal solutions. From a practical standpoint, backpropagation in the discrete space is non-trivial and often unstable. To overcome these challenges, we first provide a theoretical result demonstrating that our inverse formulation admits a unique solution, thereby ensuring valid optimization. We then introduce gradient-stable relaxations to support effective training. As a result, experiments on multiple DLMs show that our method, Inverse-distilled Diffusion Language Models (IDLM), reduces the number of inference steps by 4x-64x, while preserving the teacher model's generation quality. We provide the code, model checkpoints, and video tutorials on the project page: https://david-cripto.github.io/idlm-project-page
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_19066
institution	arXiv
publishDate	2026
record_format	arxiv
title	IDLM: Inverse-distilled Diffusion Language Models
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2602.19066

Similar Items