Salvato in:
| Autori principali: | , , , , , |
|---|---|
| Natura: | Preprint |
| Pubblicazione: |
2026
|
| Soggetti: | |
| Accesso online: | https://arxiv.org/abs/2602.00792 |
| Tags: |
Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
|
| _version_ | 1866914298549239808 |
|---|---|
| author | Chen, Guinan Huang, Xunpeng Sun, Ying Wang, Shijin Zhang, Yanyong Wang, Chao |
| author_facet | Chen, Guinan Huang, Xunpeng Sun, Ying Wang, Shijin Zhang, Yanyong Wang, Chao |
| contents | Masked discrete diffusion is a dominant paradigm for high-quality language modeling where tokens are iteratively corrupted to a mask state, yet its inference efficiency is bottlenecked by the lack of deterministic sampling tools. While diffusion duality enables deterministic distillation for uniform models, these approaches generally underperform masked models and rely on complex integral operators. Conversely, in the masked domain, prior methods typically assume the absence of deterministic trajectories, forcing a reliance on stochastic distillation. To bridge this gap, we establish explicit Masked Diffusion Duality, proving that the masked process arises as the projection of a continuous Gaussian process via a novel maximum-value index preservation mechanism. Furthermore, we introduce Masked Consistency Distillation (MCD), a principled framework that leverages this duality to analytically construct the deterministic coupled trajectories required for consistency distillation, bypassing numerical ODE solvers. This result strictly improves upon prior stochastic distillation methods, achieving a 16$\times$ inference speedup without compromising generation quality. Our findings not only provide a solid theoretical foundation connecting masked and continuous diffusion, but also unlock the full potential of consistency distillation for high-performance discrete generation. Our code is available at https://anonymous.4open.science/r/MCD-70FD. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2602_00792 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| spellingShingle | Latent Shadows: The Gaussian-Discrete Duality in Masked Diffusion Chen, Guinan Huang, Xunpeng Sun, Ying Wang, Shijin Zhang, Yanyong Wang, Chao Machine Learning Artificial Intelligence Masked discrete diffusion is a dominant paradigm for high-quality language modeling where tokens are iteratively corrupted to a mask state, yet its inference efficiency is bottlenecked by the lack of deterministic sampling tools. While diffusion duality enables deterministic distillation for uniform models, these approaches generally underperform masked models and rely on complex integral operators. Conversely, in the masked domain, prior methods typically assume the absence of deterministic trajectories, forcing a reliance on stochastic distillation. To bridge this gap, we establish explicit Masked Diffusion Duality, proving that the masked process arises as the projection of a continuous Gaussian process via a novel maximum-value index preservation mechanism. Furthermore, we introduce Masked Consistency Distillation (MCD), a principled framework that leverages this duality to analytically construct the deterministic coupled trajectories required for consistency distillation, bypassing numerical ODE solvers. This result strictly improves upon prior stochastic distillation methods, achieving a 16$\times$ inference speedup without compromising generation quality. Our findings not only provide a solid theoretical foundation connecting masked and continuous diffusion, but also unlock the full potential of consistency distillation for high-performance discrete generation. Our code is available at https://anonymous.4open.science/r/MCD-70FD. |
| title | Latent Shadows: The Gaussian-Discrete Duality in Masked Diffusion |
| topic | Machine Learning Artificial Intelligence |
| url | https://arxiv.org/abs/2602.00792 |