:: Library Catalog

Bibliographic Details
Main Authors:	Hoshino, Yuichiro; Tachibana, Hideyuki; Inahara, Muneyoshi; Takegawa, Hiroto
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2505.22135

Similar Items

DistillSpec: Improving Speculative Decoding via Knowledge Distillation
by: Zhou, Yongchao, et al.
Published: (2023)

Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation
by: Gui, Lujun, et al.
Published: (2024)

MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models
by: Ganesan, Mugilan, et al.
Published: (2025)

DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
by: Zarch, Hossein Entezari, et al.
Published: (2025)

Decoding Speculative Decoding
by: Yan, Minghao, et al.
Published: (2024)

Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting
by: Liu, Fangcheng, et al.
Published: (2024)

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
by: Hu, Yuezhou, et al.
Published: (2025)

Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
by: Ziashahabi, Amir, et al.
Published: (2025)

Faster Cascades via Speculative Decoding
by: Narasimhan, Harikrishna, et al.
Published: (2024)

Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)

Self-Speculative Biased Decoding for Faster Re-Translation
by: Zeng, Linxiao, et al.
Published: (2025)

Fast Large Language Model Collaborative Decoding via Speculation
by: Fu, Jiale, et al.
Published: (2025)

ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding
by: Amer, Walaa, et al.
Published: (2026)

Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)

Scaling Speculative Decoding with Lookahead Reasoning
by: Fu, Yichao, et al.
Published: (2025)

Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
by: Li, Jinze, et al.
Published: (2025)

Recurrent Drafter for Fast Speculative Decoding in Large Language Models
by: Cheng, Yunfei, et al.
Published: (2024)

On Speculative Decoding for Multimodal Large Language Models
by: Gagrani, Mukul, et al.
Published: (2024)

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification
by: Zhou, Yuhang, et al.
Published: (2026)

Out-of-Vocabulary Sampling Boosts Speculative Decoding
by: Timor, Nadav, et al.
Published: (2025)

Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
by: Elhoushi, Mostafa, et al.
Published: (2024)

DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
by: Goyal, Satyam, et al.
Published: (2026)

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
by: Iso, Hayate, et al.
Published: (2026)

Confidence-Modulated Speculative Decoding for Large Language Models
by: Sen, Jaydip, et al.
Published: (2025)

VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs
by: Goel, Raghavv, et al.
Published: (2025)

ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
by: Xiao, Zilin, et al.
Published: (2024)

Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
by: Sun, Shuoyang, et al.
Published: (2026)

Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
by: Bachmann, Gregor, et al.
Published: (2025)

Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding
by: Guo, Gabe, et al.
Published: (2025)

Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)

Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion
by: Christopher, Jacob K, et al.
Published: (2024)

LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
by: Samarin, Alexander, et al.
Published: (2026)

Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding
by: Agrawal, Sudhanshu, et al.
Published: (2025)

SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)

HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)

Benchmarking the Energy Savings with Speculative Decoding Strategies
by: Dutta, Rohit, et al.
Published: (2026)

A Theoretical Perspective for Speculative Decoding Algorithm
by: Yin, Ming, et al.
Published: (2024)

Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)

TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding
by: Sridhar, Aditya, et al.
Published: (2025)