:: Library Catalog

Bibliographic Details
Main Authors:	Zhong, Shuzhang; Lu, Baotong; Chen, Qi; Liu, Chuanjie; Yang, Fan; Li, Meng
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2603.07416

Similar Items

Breaking the Reward Barrier: Accelerating Tree-of-Thought Reasoning via Speculative Exploration
by: Zhong, Shuzhang, et al.
Published: (2026)

SpecExit: Accelerating Large Reasoning Model via Speculative Exit
by: Yang, Rubing, et al.
Published: (2025)

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model
by: Zhao, Lei, et al.
Published: (2025)

SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
by: Wei, Linye, et al.
Published: (2025)

SpecPipe: Accelerating Pipeline Parallelism-based LLM Inference with Speculative Decoding
by: Yin, Haofei, et al.
Published: (2025)

TriSpec: Ternary Speculative Decoding via Lightweight Proxy Verification
by: Jiang, Haoyun, et al.
Published: (2026)

SpecDiff: Accelerating Diffusion Model Inference with Self-Speculation
by: Pan, Jiayi, et al.
Published: (2025)

FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
by: Zhao, Weilin, et al.
Published: (2025)

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
by: Liu, Di, et al.
Published: (2024)

BubbleSpec: Turning Long-Tail Bubbles into Speculative Rollout Drafts for Synchronous Reinforcement Learning
by: Xu, Yuhang, et al.
Published: (2026)

Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance
by: Wang, Songsheng, et al.
Published: (2025)

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms
by: Hou, Yunlong, et al.
Published: (2025)

SlimSpec: Low-Rank Draft LM-Head for Accelerated Speculative Decoding
by: Plaksin, Anton, et al.
Published: (2026)

CAS-Spec: Cascade Adaptive Self-Speculative Decoding for On-the-Fly Lossless Inference Acceleration of LLMs
by: Ning, Zhiyuan, et al.
Published: (2025)

MoE-Spec: Expert Budgeting for Efficient Speculative Decoding
by: McDanel, Bradley, et al.
Published: (2026)

SpecMemo: Speculative Decoding is in Your Pocket
by: Yildirim, Selin, et al.
Published: (2025)

SpecAttn: Speculating Sparse Attention
by: Shah, Harsh
Published: (2025)

CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality
by: Dumitru, Razvan-Gabriel, et al.
Published: (2025)

ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems
by: Chen, Qiaoling, et al.
Published: (2025)

DistillSpec: Improving Speculative Decoding via Knowledge Distillation
by: Zhou, Yongchao, et al.
Published: (2023)

DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure
by: Xiong, Yunfan, et al.
Published: (2024)

HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)

ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
by: Xiao, Zilin, et al.
Published: (2024)

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem
by: Cha, Seongjin, et al.
Published: (2026)

MARS: Co-evolving Dual-System Deep Research via Multi-Agent Reinforcement Learning
by: Chen, Guoxin, et al.
Published: (2025)

SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
by: Pan, Rui, et al.
Published: (2025)

SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding
by: Zhong, Shuzhang, et al.
Published: (2024)

SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
by: Miao, Xupeng, et al.
Published: (2023)

SpecTr: Fast Speculative Decoding via Optimal Transport
by: Sun, Ziteng, et al.
Published: (2023)

SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding
by: Li, Shenggui, et al.
Published: (2026)

AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference
by: Zhong, Shuzhang, et al.
Published: (2024)

SpecPV: Improving Self-Speculative Decoding for Long-Context Generation via Partial Verification
by: Tan, Zhendong, et al.
Published: (2025)

Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
by: Sun, Shuoyang, et al.
Published: (2026)

SpecMD: A Comprehensive Study On Speculative Expert Prefetching
by: Hoang, Duc, et al.
Published: (2026)

SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration
by: Wen, Zhuofan, et al.
Published: (2026)

SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
by: Walton, Thomas, et al.
Published: (2025)

EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
by: Wu, Yize, et al.
Published: (2025)

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
by: Tiwari, Rishabh, et al.
Published: (2025)

DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers
by: Zhong, Xuyang, et al.
Published: (2025)