| Main Authors: | Merchant, Humzah; Levy, Bradford |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.31293 |
Similar Items
A Fast and Effective Solution to the Problem of Look-ahead Bias in LLMs
by: Merchant, Humzah, et al.
Published: (2025)
Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference
by: Qin, Zongyue, et al.
Published: (2024)
BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text
by: Wang, Siyan, et al.
Published: (2024)
UCD: Unlearning in LLMs via Contrastive Decoding
by: Suriyakumar, Vinith M., et al.
Published: (2025)
Auxiliary Metrics Help Decoding Skill Neurons in the Wild
by: Zhao, Yixiu, et al.
Published: (2025)
(G)I-DLE: Generative Inference via Distribution-preserving Logit Exclusion with KL Divergence Minimization for Constrained Decoding
by: Lee, Hanwool
Published: (2025)
RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models
by: Wang, Bichen, et al.
Published: (2024)
Revisiting Judge Decoding from First Principles via Training-Free Distributional Divergence
by: Sun, Shengyin, et al.
Published: (2026)
Speculative Streaming: Fast LLM Inference without Auxiliary Models
by: Bhendawade, Nikhil, et al.
Published: (2024)
ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)
Humans and LLMs Diverge on Probabilistic Inferences
by: Kamath, Gaurav, et al.
Published: (2026)
Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs
by: Huang, Jen-tse, et al.
Published: (2026)
Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention
by: Kiruluta, Andrew
Published: (2026)
Mitigating Biases in Language Models via Bias Unlearning
by: Liu, Dianqing, et al.
Published: (2025)
N-gram-like Language Models Predict Reading Time Best
by: Michaelov, James A., et al.
Published: (2026)
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
by: Dorna, Vineeth, et al.
Published: (2025)
Avoiding Copyright Infringement via Large Language Model Unlearning
by: Dou, Guangyao, et al.
Published: (2024)
LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding
by: Lin, Gang, et al.
Published: (2026)
Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training
by: Tran, Toan, et al.
Published: (2025)
CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
by: Liu, Dong, et al.
Published: (2025)
AI-VERDE: A Gateway for Egalitarian Access to Large Language Model-Based Resources For Educational Institutions
by: Mithun, Paul, et al.
Published: (2025)
LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding
by: Xu, Chenkai, et al.
Published: (2025)
AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference
by: Lu, Kuan-Wei, et al.
Published: (2025)
Unlearning What Matters: Token-Level Attribution for Precise Language Model Unlearning
by: Wu, Jiawei, et al.
Published: (2026)
Accelerating Transformer Inference for Translation via Parallel Decoding
by: Santilli, Andrea, et al.
Published: (2023)
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
by: Yang, Dingkang, et al.
Published: (2024)
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
by: Guo, Phillip, et al.
Published: (2024)
Connecting the Persian-speaking World through Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)
Training and Inference Efficiency of Encoder-Decoder Speech Models
by: Żelasko, Piotr, et al.
Published: (2025)
FlashDecoding++: Faster Large Language Model Inference on GPUs
by: Hong, Ke, et al.
Published: (2023)
Plato: Plan to Efficiently Decode for Large Language Model Inference
by: Jin, Shuowei, et al.
Published: (2024)
Learning-Time Encoding Shapes Unlearning in LLMs
by: Wu, Ruihan, et al.
Published: (2025)
Speculative Decoding for Multi-Sample Inference
by: Li, Yiwei, et al.
Published: (2025)
Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge
by: Lu, Weikai, et al.
Published: (2024)
Model Unlearning via Sparse Autoencoder Subspace Guided Projections
by: Wang, Xu, et al.
Published: (2025)
Dissecting Language Models: Machine Unlearning via Selective Pruning
by: Pochinkov, Nicholas, et al.
Published: (2024)
Nudging: Inference-time Alignment of LLMs via Guided Decoding
by: Fei, Yu, et al.
Published: (2024)
Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
by: Xia, Heming, et al.
Published: (2025)
Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism
by: Liu, Jiahao, et al.
Published: (2024)
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
by: Xia, Heming, et al.
Published: (2024)