| Main Authors: | Hoshino, Yuichiro, Tachibana, Hideyuki, Inahara, Muneyoshi, Takegawa, Hiroto |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.22135 |
Similar Items
DistillSpec: Improving Speculative Decoding via Knowledge Distillation
by: Zhou, Yongchao, et al.
Published: (2023)
Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation
by: Gui, Lujun, et al.
Published: (2024)
MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
by: Ganesan, Mugilan, et al.
Published: (2025)
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
by: Zarch, Hossein Entezari, et al.
Published: (2025)
Decoding Speculative Decoding
by: Yan, Minghao, et al.
Published: (2024)
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting
by: Liu, Fangcheng, et al.
Published: (2024)
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
by: Hu, Yuezhou, et al.
Published: (2025)
Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
by: Ziashahabi, Amir, et al.
Published: (2025)
Faster Cascades via Speculative Decoding
by: Narasimhan, Harikrishna, et al.
Published: (2024)
Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)
Self-Speculative Biased Decoding for Faster Re-Translation
by: Zeng, Linxiao, et al.
Published: (2025)
Fast Large Language Model Collaborative Decoding via Speculation
by: Fu, Jiale, et al.
Published: (2025)
ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding
by: Amer, Walaa, et al.
Published: (2026)
Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)
Scaling Speculative Decoding with Lookahead Reasoning
by: Fu, Yichao, et al.
Published: (2025)
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
by: Li, Jinze, et al.
Published: (2025)
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
by: Cheng, Yunfei, et al.
Published: (2024)
On Speculative Decoding for Multimodal Large Language Models
by: Gagrani, Mukul, et al.
Published: (2024)
OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification
by: Zhou, Yuhang, et al.
Published: (2026)
Out-of-Vocabulary Sampling Boosts Speculative Decoding
by: Timor, Nadav, et al.
Published: (2025)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
by: Elhoushi, Mostafa, et al.
Published: (2024)
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
by: Goyal, Satyam, et al.
Published: (2026)
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
by: Iso, Hayate, et al.
Published: (2026)
Confidence-Modulated Speculative Decoding for Large Language Models
by: Sen, Jaydip, et al.
Published: (2025)
VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs
by: Goel, Raghavv, et al.
Published: (2025)
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
by: Xiao, Zilin, et al.
Published: (2024)
Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
by: Sun, Shuoyang, et al.
Published: (2026)
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
by: Bachmann, Gregor, et al.
Published: (2025)
Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding
by: Guo, Gabe, et al.
Published: (2025)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion
by: Christopher, Jacob K, et al.
Published: (2024)
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
by: Samarin, Alexander, et al.
Published: (2026)
Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
by: Agrawal, Sudhanshu, et al.
Published: (2025)
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)
HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)
Benchmarking the Energy Savings with Speculative Decoding Strategies
by: Dutta, Rohit, et al.
Published: (2026)
A Theoretical Perspective for Speculative Decoding Algorithm
by: Yin, Ming, et al.
Published: (2024)
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)
TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding
by: Sridhar, Aditya, et al.
Published: (2025)