| Main Authors: | Guo, Gabe, Ermon, Stefano |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2504.20456 |
Similar Items
ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space
by: Guo, Gabe, et al.
Published: (2026)
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
by: Xiao, Zilin, et al.
Published: (2024)
Out-of-Vocabulary Sampling Boosts Speculative Decoding
by: Timor, Nadav, et al.
Published: (2025)
Decoding Speculative Decoding
by: Yan, Minghao, et al.
Published: (2024)
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
by: Lou, Aaron, et al.
Published: (2023)
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
by: Bachmann, Gregor, et al.
Published: (2025)
AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation
by: Cheng, Dongjie, et al.
Published: (2026)
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
by: Li, Jia-Nan, et al.
Published: (2025)
Improving Diffusion Language Model Decoding through Joint Search in Generation Order and Token Space
by: Shen, Yangyi, et al.
Published: (2026)
Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)
MineDraft: A Framework for Batch Parallel Speculative Decoding
by: Tang, Zhenwei, et al.
Published: (2026)
Boosting Lossless Speculative Decoding via Feature Sampling and Partial Alignment Distillation
by: Gui, Lujun, et al.
Published: (2024)
You Need Better Attention Priors
by: Litman, Elon, et al.
Published: (2026)
RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance
by: Chen, Tianlang, et al.
Published: (2025)
Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)
Scaling Speculative Decoding with Lookahead Reasoning
by: Fu, Yichao, et al.
Published: (2025)
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
by: Cheng, Yunfei, et al.
Published: (2024)
On Speculative Decoding for Multimodal Large Language Models
by: Gagrani, Mukul, et al.
Published: (2024)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation
by: Manvi, Rohin, et al.
Published: (2024)
Disentangling Length from Quality in Direct Preference Optimization
by: Park, Ryan, et al.
Published: (2024)
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
by: Goyal, Satyam, et al.
Published: (2026)
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
by: Hu, Yuezhou, et al.
Published: (2025)
Confidence-Modulated Speculative Decoding for Large Language Models
by: Sen, Jaydip, et al.
Published: (2025)
VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs
by: Goel, Raghavv, et al.
Published: (2025)
Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
by: Sun, Shuoyang, et al.
Published: (2026)
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)
Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective
by: Ou, Jingyang, et al.
Published: (2025)
RAD: Redundancy-Aware Distillation for Hybrid Models via Self-Speculative Decoding
by: Hoshino, Yuichiro, et al.
Published: (2025)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
Faster Cascades via Speculative Decoding
by: Narasimhan, Harikrishna, et al.
Published: (2024)
Learning Harmonized Representations for Speculative Sampling
by: Zhang, Lefan, et al.
Published: (2024)
GeoLLM: Extracting Geospatial Knowledge from Large Language Models
by: Manvi, Rohin, et al.
Published: (2023)
Fast Large Language Model Collaborative Decoding via Speculation
by: Fu, Jiale, et al.
Published: (2025)
Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
by: Ziashahabi, Amir, et al.
Published: (2025)
Why Any-Order Autoregressive Models Need Two-Stream Attention: A Structural-Semantic Tradeoff
by: Pynadath, Patrick, et al.
Published: (2026)
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion
by: Christopher, Jacob K, et al.
Published: (2024)
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
by: Samarin, Alexander, et al.
Published: (2026)
DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling
by: Zhou, Linqi, et al.
Published: (2023)
BASS: Batched Attention-optimized Speculative Sampling
by: Qian, Haifeng, et al.
Published: (2024)