| Main Authors: | Song, Jingwei; Wang, Xinyu; Wang, Hanbin; Lei, Xiaoxuan; Shi, Bill; Han, Shixin; Yang, Eric; Chang, Xiao-Wen; Ai, Lynn |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.15498 |
Similar Items
Speculative Decoding in Decentralized LLM Inference: Turning Communication Latency into Computation Throughput
by: Song, Jingwei, et al.
Published: (2025)
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification
by: Wang, Jikai, et al.
Published: (2025)
MARS: Unleashing the Power of Variance Reduction for Training Large Models
by: Yuan, Huizhuo, et al.
Published: (2024)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
Speculative Decoding: Performance or Illusion?
by: Liu, Xiaoxuan, et al.
Published: (2025)
TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding
by: Wang, Zhuoyu, et al.
Published: (2026)
Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)
Speculative Safety-Aware Decoding
by: Wang, Xuekang, et al.
Published: (2025)
Accelerate Speculative Decoding with Sparse Computation in Verification
by: Wang, Jikai, et al.
Published: (2025)
MARS: Margin and Semantic-Aware Data Augmentation for Reward Modeling
by: Bhattacharjee, Payel, et al.
Published: (2026)
Symphony: A Decentralized Multi-Agent Framework for Scalable Collective Intelligence
by: Wang, Ji, et al.
Published: (2025)
Lattica: A Decentralized Cross-NAT Communication Framework for Scalable AI Inference and Training
by: Yang, Ween, et al.
Published: (2025)
DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
by: Wang, Ziyi, et al.
Published: (2026)
Block Verification Accelerates Speculative Decoding
by: Sun, Ziteng, et al.
Published: (2024)
Speculative Verification: Exploiting Information Gain to Refine Speculative Decoding
by: Kim, Sungkyun, et al.
Published: (2025)
Efficiency Unleashed: Inference Acceleration for LLM-based Recommender Systems with Speculative Decoding
by: Xi, Yunjia, et al.
Published: (2024)
Making Every Verified Token Count: Adaptive Verification for MoE Speculative Decoding
by: Pan, Lehan, et al.
Published: (2026)
Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding
by: Su, Xin, et al.
Published: (2026)
Multi-Agent Collaborative Reward Design for Enhancing Reasoning in Reinforcement Learning
by: Yang, Pei, et al.
Published: (2025)
Multi-Candidate Speculative Decoding
by: Yang, Sen, et al.
Published: (2024)
The Power of Simple Menus in Robust Selling Mechanisms
by: Wang, Shixin
Published: (2023)
From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
by: Purohit, Kiran, et al.
Published: (2026)
Scaling Laws for Speculative Decoding
by: Yan, Siyuan, et al.
Published: (2025)
TriSpec: Ternary Speculative Decoding via Lightweight Proxy Verification
by: Jiang, Haoyun, et al.
Published: (2026)
AdaSpec: Adaptive Speculative Decoding for Fast, SLO-Aware Large Language Model Serving
by: Huang, Kaiyu, et al.
Published: (2025)
Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge
by: Xiao, Bin, et al.
Published: (2024)
Speculative Contrastive Decoding
by: Yuan, Hongyi, et al.
Published: (2023)
LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
by: Yang, Penghui, et al.
Published: (2025)
Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding
by: Wang, Yixuan, et al.
Published: (2025)
TALON: Confidence-Aware Speculative Decoding with Adaptive Token Trees
by: Liu, Tianyu, et al.
Published: (2026)
Speeding up Speculative Decoding via Sequential Approximate Verification
by: Zhong, Meiyu, et al.
Published: (2025)
Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning
by: Wang, Hanbin, et al.
Published: (2026)
Speculative Speculative Decoding
by: Kumar, Tanishq, et al.
Published: (2026)
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
by: Wang, Jikai, et al.
Published: (2024)
BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding
by: He, Liang, et al.
Published: (2026)
Opt-Verifier: Unleashing the Power of LLMs for Optimization Modeling via Dual-Side Verification
by: Liu, Haoyang, et al.
Published: (2026)
EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs
by: Han, Chang, et al.
Published: (2026)
VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping
by: Dong, Haotian, et al.
Published: (2025)
Mamba Drafters for Speculative Decoding
by: Choi, Daewon, et al.
Published: (2025)
Clover-2: Accurate Inference for Regressive Lightweight Speculative Decoding
by: Xiao, Bin, et al.
Published: (2024)