| Main Authors: | Thomas, Rahul, Kitanovski, Teo, Goldblum, Micah, Pal, Arka |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.16994 |
Similar Items
Knowing What You Know Is Not Enough: Large Language Model Confidences Don't Align With Their Actions
by: Pal, Arka, et al.
Published: (2025)
Greedy Multi-Path Block Verification for Faster Decoding in Speculative Sampling
by: Thomas, Rahul, et al.
Published: (2026)
Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Minimization
by: Thomas, Rahul Krishna, et al.
Published: (2025)
Cascade: Token-Sharded Private LLM Inference
by: Thomas, Rahul, et al.
Published: (2025)
An Attack to Break Permutation-Based Private Third-Party Inference Schemes for LLMs
by: Thomas, Rahul, et al.
Published: (2025)
vTune: Verifiable Fine-Tuning for LLMs Through Backdooring
by: Zhang, Eva, et al.
Published: (2024)
Privacy-Preserving Mechanisms Enable Cheap Verifiable Inference of LLMs
by: Pal, Arka, et al.
Published: (2026)
Large Language Models Must Be Taught to Know What They Don't Know
by: Kapoor, Sanyam, et al.
Published: (2024)
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
by: Goldblum, Micah, et al.
Published: (2023)
Speculative Speculative Decoding
by: Kumar, Tanishq, et al.
Published: (2026)
Multi-Token Prediction via Self-Distillation
by: Kirchenbauer, John, et al.
Published: (2026)
Adaptive Retention & Correction: Test-Time Training for Continual Learning
by: Chen, Haoran, et al.
Published: (2024)
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful
by: Marek, Martin, et al.
Published: (2025)
Compute Better Spent: Replacing Dense Layers with Structured Matrices
by: Qiu, Shikai, et al.
Published: (2024)
Decoding Speculative Decoding
by: Yan, Minghao, et al.
Published: (2024)
The Lie Derivative for Measuring Learned Equivariance
by: Gruver, Nate, et al.
Published: (2022)
DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure
by: Xiong, Yunfan, et al.
Published: (2024)
On Training in Imagination
by: Timor, Nadav, et al.
Published: (2026)
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
by: Li, Ang, et al.
Published: (2025)
Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
by: Bhansali, Shrenik, et al.
Published: (2025)
Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models
by: Lotfi, Sanae, et al.
Published: (2024)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
by: McLeish, Sean, et al.
Published: (2025)
Online Speculative Decoding
by: Liu, Xiaoxuan, et al.
Published: (2023)
DistillSpec: Improving Speculative Decoding via Knowledge Distillation
by: Zhou, Yongchao, et al.
Published: (2023)
Identifying and Evaluating Inactive Heads in Pretrained LLMs
by: Sandoval-Segura, Pedro, et al.
Published: (2025)
Non-Vacuous Generalization Bounds for Large Language Models
by: Lotfi, Sanae, et al.
Published: (2023)
Speculative Decoding Across Languages
by: Paudel, Nirajan, et al.
Published: (2026)
Speculative Safety-Aware Decoding
by: Wang, Xuekang, et al.
Published: (2025)
Closing the Train-Test Gap in World Models for Gradient-Based Planning
by: Parthasarathy, Arjun, et al.
Published: (2025)
Just How Flexible are Neural Networks in Practice?
by: Shwartz-Ziv, Ravid, et al.
Published: (2024)
TapOut: A Bandit-Based Approach to Dynamic Speculative Decoding
by: Sridhar, Aditya, et al.
Published: (2025)
Fast Inference via Hierarchical Speculative Decoding
by: Mohri, Clara, et al.
Published: (2025)
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding
by: Guan, Yue, et al.
Published: (2025)
SpecPV: Improving Self-Speculative Decoding for Long-Context Generation via Partial Verification
by: Tan, Zhendong, et al.
Published: (2025)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
Scaling Speculative Decoding with Lookahead Reasoning
by: Fu, Yichao, et al.
Published: (2025)
STree: Speculative Tree Decoding for Hybrid State-Space Models
by: Wu, Yangchao, et al.
Published: (2025)
SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
by: Walton, Thomas, et al.
Published: (2025)
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks
by: Feuer, Benjamin, et al.
Published: (2024)