| Main Authors: | He, Liang, Wen, Jingbo, Zhan, Qishi, Chen, Yixiong, Cui, Kangning, Lan, Qizhen, Wang, Xilu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.00144 |
Similar Items
Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
by: Bhansali, Shrenik, et al.
Published: (2025)
Cost-Aware Diffusion Draft Trees for Speculative Decoding
by: Zhang, Shuai, et al.
Published: (2026)
Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks
by: Zhan, Qishi, et al.
Published: (2026)
SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding
by: Sun, Ryan, et al.
Published: (2024)
TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding
by: Wang, Zhuoyu, et al.
Published: (2026)
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
by: Hong, Fenglu, et al.
Published: (2025)
Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning
by: Zhang, Jiebin, et al.
Published: (2026)
Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
by: Wen, Zhuofan, et al.
Published: (2024)
Training-free Dropout Sampling for Semantic Token Acceptance in Speculative Decoding
by: Lee, Jeongtae, et al.
Published: (2026)
AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures
by: Zhang, Situo, et al.
Published: (2024)
DREAM-S: Speculative Decoding with Searchable Drafting and Target-Aware Refinement for Multimodal Generation
by: Liu, Zining, et al.
Published: (2026)
Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
by: Zhang, Ziyin, et al.
Published: (2024)
ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts
by: Georganas, Evangelos, et al.
Published: (2025)
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
by: Lv, Kai, et al.
Published: (2025)
SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
by: Shukla, Shikhar
Published: (2026)
FastEagle: Cascaded Drafting for Accelerating Speculative Decoding
by: Huang, Haiduo, et al.
Published: (2025)
PEARL: Parallel Speculative Decoding with Adaptive Draft Length
by: Liu, Tianyu, et al.
Published: (2024)
Accelerating Speculative Decoding with Block Diffusion Draft Trees
by: Ringel, Liran, et al.
Published: (2026)
SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism
by: Shen, Yuhao, et al.
Published: (2025)
SpecTr-GBV: Multi-Draft Block Verification Accelerating Speculative Decoding
by: Lin, Yijun, et al.
Published: (2026)
Draft-OPD: On-Policy Distillation for Speculative Draft Models
by: Lei, Haodi, et al.
Published: (2026)
KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning
by: Zhang, Kaiqi, et al.
Published: (2024)
Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)
Make Every Draft Count: Hidden State based Speculative Decoding
by: Chen, Yuetao, et al.
Published: (2026)
Principled Coarse-Grained Acceptance for Speculative Decoding in Speech
by: Yanuka, Moran, et al.
Published: (2025)
Acceptance Dynamics Across Cognitive Domains in Speculative Decoding
by: Mahmoud, Saif
Published: (2026)
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
by: Wu, Zhaoxuan, et al.
Published: (2025)
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
by: Wang, Jikai, et al.
Published: (2024)
MineDraft: A Framework for Batch Parallel Speculative Decoding
by: Tang, Zhenwei, et al.
Published: (2026)
When Drafts Evolve: Speculative Decoding Meets Online Learning
by: Qian, Yu-Yang, et al.
Published: (2026)
POSS: Position Specialist Generates Better Draft for Speculative Decoding
by: Huang, Langlin, et al.
Published: (2025)
Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match
by: Li, Jinze, et al.
Published: (2025)
Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding
by: Shen, Yuhao, et al.
Published: (2026)
SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting
by: Shi, Weijie, et al.
Published: (2026)
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
by: Agrawal, Sudhanshu, et al.
Published: (2024)
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
by: Wang, Zilong, et al.
Published: (2024)
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
by: Samarin, Alexander, et al.
Published: (2026)
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
by: Hu, Shijing, et al.
Published: (2025)
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
by: Zhao, Weilin, et al.
Published: (2024)
Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding
by: Huang, Jianuo, et al.
Published: (2026)