:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Jian; Liang, Yesheng; Liu, Zhijian
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2602.06036

Similar Items

Accelerating Speculative Decoding with Block Diffusion Draft Trees
by: Ringel, Liran, et al.
Published: (2026)

SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting
by: Shi, Weijie, et al.
Published: (2026)

ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
by: Liang, Yesheng, et al.
Published: (2025)

DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding
by: Zhang, Jiebin, et al.
Published: (2026)

FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion
by: Chen, Zhuokun, et al.
Published: (2026)

Block Verification Accelerates Speculative Decoding
by: Sun, Ziteng, et al.
Published: (2024)

DART: Diffusion-Inspired Speculative Decoding for Fast LLM Inference
by: Liu, Fuliang, et al.
Published: (2026)

PSD: Pushing the Pareto Frontier of Diffusion LLMs via Parallel Speculative Decoding
by: Sun, Shengyin, et al.
Published: (2026)

SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding
by: Lin, Yijun, et al.
Published: (2026)

Speculative Decoding with a Speculative Vocabulary
by: Williams, Miles, et al.
Published: (2026)

Cost-Aware Diffusion Draft Trees for Speculative Decoding
by: Zhang, Shuai, et al.
Published: (2026)

Self Speculative Decoding for Diffusion Large Language Models
by: Gao, Yifeng, et al.
Published: (2025)

Decoding Speculative Decoding
by: Yan, Minghao, et al.
Published: (2024)

Multi-Candidate Speculative Decoding
by: Yang, Sen, et al.
Published: (2024)

Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion
by: Christopher, Jacob K, et al.
Published: (2024)

Graph-Structured Speculative Decoding
by: Gong, Zhuocheng, et al.
Published: (2024)

Speculative Contrastive Decoding
by: Yuan, Hongyi, et al.
Published: (2023)

Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding
by: Wang, Pei-Shuo, et al.
Published: (2025)

DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
by: Goyal, Satyam, et al.
Published: (2026)

Scaling Laws for Speculative Decoding
by: Yan, Siyuan, et al.
Published: (2025)

Speculative Decoding: Performance or Illusion?
by: Liu, Xiaoxuan, et al.
Published: (2025)

Block Sparse Flash Attention
by: Ohayon, Daniel, et al.
Published: (2025)

Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)

3-Model Speculative Decoding
by: Byun, Sanghyun, et al.
Published: (2025)

Speculative Verification: Exploiting Information Gain to Refine Speculative Decoding
by: Kim, Sungkyun, et al.
Published: (2025)

DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding
by: Li, Guanghao, et al.
Published: (2025)

Speculative Decoding and Beyond: An In-Depth Survey of Techniques
by: Hu, Yunhai, et al.
Published: (2025)

Factorization-Error-Free Discrete Diffusion Language Model via Speculative Decoding
by: Fang, Xun, et al.
Published: (2026)

SpecDiff-2: Scaling Diffusion Drafter Alignment For Faster Speculative Decoding
by: Sandler, Jameson, et al.
Published: (2025)

Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)

Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models
by: Mamou, Jonathan, et al.
Published: (2024)

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
by: Chen, Zhuoming, et al.
Published: (2024)

Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
by: Zhang, Jiebin, et al.
Published: (2026)

PEARL: Parallel Speculative Decoding with Adaptive Draft Length
by: Liu, Tianyu, et al.
Published: (2024)

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
by: Sadhukhan, Ranajoy, et al.
Published: (2024)

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
by: Li, Minghan, et al.
Published: (2024)

Improving Multi-candidate Speculative Decoding
by: Lu, Xiaofan, et al.
Published: (2024)

LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation
by: Liu, Tianyu, et al.
Published: (2025)

Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)

The Disparate Impacts of Speculative Decoding
by: Sandler, Jameson, et al.
Published: (2025)