| Main Authors: | Wu, Yangchao, Qin, Zongyue, Wong, Alex, Soatto, Stefano |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.14969 |
Similar Items
Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
by: Lao, Dong, et al.
Published: (2025)
Priming: Hybrid State Space Models From Pre-trained Transformers
by: Chattopadhyay, Aditya, et al.
Published: (2026)
WorDepth: Variational Language Prior for Monocular Depth Estimation
by: Zeng, Ziyao, et al.
Published: (2024)
PICASO: Permutation-Invariant Context Composition with State Space Models
by: Liu, Tian Yu, et al.
Published: (2025)
Sub-token ViT Embedding via Stochastic Resonance Transformers
by: Lao, Dong, et al.
Published: (2023)
AI Agents as Universal Task Solvers
by: Achille, Alessandro, et al.
Published: (2025)
Robust Planning for Autonomous Driving via Mixed Adversarial Diffusion Predictions
by: Zhao, Albert, et al.
Published: (2025)
Draft Less, Retrieve More: Hybrid Tree Construction for Speculative Decoding
by: Shen, Yuhao, et al.
Published: (2026)
Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance
by: Wang, Songsheng, et al.
Published: (2025)
QSpec: Speculative Decoding with Complementary Quantization Schemes
by: Zhao, Juntao, et al.
Published: (2024)
Meanings and Feelings of Large Language Models: Observability of Latent States in Generative AI
by: Liu, Tian Yu, et al.
Published: (2024)
Speculative Safety-Aware Decoding
by: Wang, Xuekang, et al.
Published: (2025)
Dynamic-Width Speculative Beam Decoding for Efficient LLM Inference
by: Qin, Zongyue, et al.
Published: (2024)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
Critical Learning Periods Emerge Even in Deep Linear Networks
by: Kleinman, Michael, et al.
Published: (2023)
Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
by: Li, Jinze, et al.
Published: (2025)
Accelerating PayPal's Commerce Agent with Speculative Decoding: An Empirical Study on EAGLE3 with Fine-Tuned Nemotron Models
by: Qin, Ally, et al.
Published: (2026)
When Drafts Evolve: Speculative Decoding Meets Online Learning
by: Qian, Yu-Yang, et al.
Published: (2026)
On Speculative Decoding for Multimodal Large Language Models
by: Gagrani, Mukul, et al.
Published: (2024)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
Heuristic Methods are Good Teachers to Distill MLPs for Graph Link Prediction
by: Qin, Zongyue, et al.
Published: (2025)
Confidence-Modulated Speculative Decoding for Large Language Models
by: Sen, Jaydip, et al.
Published: (2025)
Linear Spaces of Meanings: Compositional Structures in Vision-Language Models
by: Trager, Matthew, et al.
Published: (2023)
e1: Learning Adaptive Control of Reasoning Effort
by: Kleinman, Michael, et al.
Published: (2025)
FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference
by: Bajpai, Divya Jyoti, et al.
Published: (2025)
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
by: Zhao, Yilong, et al.
Published: (2025)
Faster Cascades via Speculative Decoding
by: Narasimhan, Harikrishna, et al.
Published: (2024)
DAWM: Diffusion Action World Models for Offline Reinforcement Learning via Action-Inferred Transitions
by: Li, Zongyue, et al.
Published: (2025)
Fast Large Language Model Collaborative Decoding via Speculation
by: Fu, Jiale, et al.
Published: (2025)
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms
by: Hou, Yunlong, et al.
Published: (2025)
HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)
Benchmarking the Energy Savings with Speculative Decoding Strategies
by: Dutta, Rohit, et al.
Published: (2026)
A Theoretical Perspective for Speculative Decoding Algorithm
by: Yin, Ming, et al.
Published: (2024)
Training Data Protection with Compositional Diffusion Models
by: Golatkar, Aditya, et al.
Published: (2023)
Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models
by: Liu, Xiaoze, et al.
Published: (2026)
Re-FORC: Adaptive Reward Prediction for Efficient Chain-of-Thought Reasoning
by: Zabounidis, Renos, et al.
Published: (2025)
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
by: Tiwari, Rishabh, et al.
Published: (2025)
HiViS: Hiding Visual Tokens from the Drafter for Speculative Decoding in Vision-Language Models
by: Xie, Zhinan, et al.
Published: (2025)
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
by: Hong, Fenglu, et al.
Published: (2025)