Bibliographic Details
Main authors: Zhang, Yu; Li, Xinchen; Zhou, Jialei; Ma, Hongnan; Wan, Zhongwei; Shi, Yiwei; Miao, Duoqian; Zhang, Qi; Cao, Longbing
Format: Preprint
Published: 2026
Subjects:
Online access: https://arxiv.org/abs/2602.04399
_version_ 1866918322636849152
author Zhang, Yu
Li, Xinchen
Zhou, Jialei
Ma, Hongnan
Wan, Zhongwei
Shi, Yiwei
Miao, Duoqian
Zhang, Qi
Cao, Longbing
contents Block-wise decoding effectively improves the inference speed and quality in diffusion language models (DLMs) by combining inter-block sequential denoising and intra-block parallel unmasking. However, existing block-wise decoding methods typically partition blocks in a rigid and fixed manner, which inevitably fragments complete semantic or syntactic constituents, leading to suboptimal performance. Inspired by the entropy reduction hypothesis (ERH), we recognize that constituent boundaries offer greater opportunities for uncertainty reduction, which motivates us to employ entropy analysis for identifying constituent boundaries. Therefore, we propose Swordsman, an entropy-driven adaptive block-wise decoding framework for DLMs. Swordsman adaptively partitions blocks by identifying entropy shifts between adjacent tokens to better align with semantic or syntactic constituent boundaries. In addition, Swordsman dynamically adjusts unmasking thresholds conditioned on the real-time unmasking status within a block, further improving both efficiency and stability. As a training-free framework, supported by KV Cache, Swordsman demonstrates state-of-the-art performance across extensive evaluations.
format Preprint
id arxiv_https___arxiv_org_abs_2602_04399
institution arXiv
publishDate 2026
record_format arxiv
title Swordsman: Entropy-Driven Adaptive Block Partition for Efficient Diffusion Language Models
topic Computation and Language
url https://arxiv.org/abs/2602.04399
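
The abstract in the contents field describes two mechanisms: placing block boundaries where the entropy shifts between adjacent tokens, and adjusting the unmasking threshold according to how much of a block is already unmasked. The following is a minimal Python sketch of that idea, not the authors' implementation. The function names, the `shift_threshold` and `max_block` parameters, and the linear threshold schedule are all assumptions made for illustration; the record does not specify them.

```python
import math
from typing import List, Sequence, Tuple


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one token's predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def partition_blocks(
    entropies: Sequence[float],
    shift_threshold: float = 0.5,  # hypothetical value, not from the record
    max_block: int = 8,            # hypothetical cap on block length
) -> List[Tuple[int, int]]:
    """Split positions into half-open [start, end) blocks.

    A boundary is placed before position i when the absolute entropy change
    from position i-1 to i reaches shift_threshold, or when the current
    block has already reached max_block tokens.
    """
    blocks: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(entropies)):
        shift = abs(entropies[i] - entropies[i - 1])
        if shift >= shift_threshold or i - start >= max_block:
            blocks.append((start, i))
            start = i
    if start < len(entropies):
        blocks.append((start, len(entropies)))
    return blocks


def unmask_threshold(base: float, floor: float, unmasked: int, block_len: int) -> float:
    """Assumed schedule: move the confidence threshold linearly from
    `base` (nothing unmasked) to `floor` (whole block unmasked)."""
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    return base - (base - floor) * (unmasked / block_len)


if __name__ == "__main__":
    # Toy per-position entropies; a real decoder would compute these from
    # the model's predicted token distributions with token_entropy().
    toy_entropies = [0.2, 0.3, 1.5, 1.4, 1.3, 0.1]
    print(partition_blocks(toy_entropies))
    print(unmask_threshold(0.9, 0.5, unmasked=2, block_len=4))
```

In this sketch the boundaries fall where the toy entropy sequence jumps, so runs of similar uncertainty stay in the same block. Whether the actual method relaxes or tightens the threshold as a block fills is not stated in the abstract; the linear relaxation here is only one plausible choice.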