Library Catalog

Bibliographic Details
Main Authors:	He, Liang; Wen, Jingbo; Zhan, Qishi; Chen, Yixiong; Cui, Kangning; Lan, Qizhen; Wang, Xilu
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.00144

Similar Items

Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
by: Bhansali, Shrenik, et al.
Published: (2025)

Cost-Aware Diffusion Draft Trees for Speculative Decoding
by: Zhang, Shuai, et al.
Published: (2026)

Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks
by: Zhan, Qishi, et al.
Published: (2026)

SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding
by: Sun, Ryan, et al.
Published: (2024)

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding
by: Wang, Zhuoyu, et al.
Published: (2026)

Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
by: Hong, Fenglu, et al.
Published: (2025)

Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
by: Zhang, Jiebin, et al.
Published: (2026)

Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
by: Wen, Zhuofan, et al.
Published: (2024)

Training-free Dropout Sampling for Semantic Token Acceptance in Speculative Decoding
by: Lee, Jeongtae, et al.
Published: (2026)

AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures
by: Zhang, Situo, et al.
Published: (2024)

DREAM-S: Speculative Decoding with Searchable Drafting and Target-Aware Refinement for Multimodal Generation
by: Liu, Zining, et al.
Published: (2026)

Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
by: Zhang, Ziyin, et al.
Published: (2024)

ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts
by: Georganas, Evangelos, et al.
Published: (2025)

DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
by: Lv, Kai, et al.
Published: (2025)

SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
by: Shukla, Shikhar
Published: (2026)

FastEagle: Cascaded Drafting for Accelerating Speculative Decoding
by: Huang, Haiduo, et al.
Published: (2025)

PEARL: Parallel Speculative Decoding with Adaptive Draft Length
by: Liu, Tianyu, et al.
Published: (2024)

Accelerating Speculative Decoding with Block Diffusion Draft Trees
by: Ringel, Liran, et al.
Published: (2026)

SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism
by: Shen, Yuhao, et al.
Published: (2025)

SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding
by: Lin, Yijun, et al.
Published: (2026)

Draft-OPD: On-Policy Distillation for Speculative Draft Models
by: Lei, Haodi, et al.
Published: (2026)

KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning
by: Zhang, Kaiqi, et al.
Published: (2024)

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)

Make Every Draft Count: Hidden State based Speculative Decoding
by: Chen, Yuetao, et al.
Published: (2026)

Principled Coarse-Grained Acceptance for Speculative Decoding in Speech
by: Yanuka, Moran, et al.
Published: (2025)

Acceptance Dynamics Across Cognitive Domains in Speculative Decoding
by: Mahmoud, Saif
Published: (2026)

TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
by: Wu, Zhaoxuan, et al.
Published: (2025)

OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
by: Wang, Jikai, et al.
Published: (2024)

MineDraft: A Framework for Batch Parallel Speculative Decoding
by: Tang, Zhenwei, et al.
Published: (2026)

When Drafts Evolve: Speculative Decoding Meets Online Learning
by: Qian, Yu-Yang, et al.
Published: (2026)

POSS: Position Specialist Generates Better Draft for Speculative Decoding
by: Huang, Langlin, et al.
Published: (2025)

Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match
by: Li, Jinze, et al.
Published: (2025)

Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding
by: Shen, Yuhao, et al.
Published: (2026)

SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting
by: Shi, Weijie, et al.
Published: (2026)

AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
by: Agrawal, Sudhanshu, et al.
Published: (2024)

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
by: Wang, Zilong, et al.
Published: (2024)

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
by: Samarin, Alexander, et al.
Published: (2026)

Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
by: Hu, Shijing, et al.
Published: (2025)

Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
by: Zhao, Weilin, et al.
Published: (2024)

Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding
by: Huang, Jianuo, et al.
Published: (2026)