Library Catalog

Bibliographic Details
Main Authors:	Merchant, Humzah; Levy, Bradford
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.31293

Similar Items

A Fast and Effective Solution to the Problem of Look-ahead Bias in LLMs
by: Merchant, Humzah, et al.
Published: (2025)

Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference
by: Qin, Zongyue, et al.
Published: (2024)

BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text
by: Wang, Siyan, et al.
Published: (2024)

UCD: Unlearning in LLMs via Contrastive Decoding
by: Suriyakumar, Vinith M., et al.
Published: (2025)

Auxiliary Metrics Help Decoding Skill Neurons in the Wild
by: Zhao, Yixiu, et al.
Published: (2025)

(G)I-DLE: Generative Inference via Distribution-preserving Logit Exclusion with KL Divergence Minimization for Constrained Decoding
by: Lee, Hanwool
Published: (2025)

RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models
by: Wang, Bichen, et al.
Published: (2024)

Revisiting Judge Decoding from First Principles via Training-Free Distributional Divergence
by: Sun, Shengyin, et al.
Published: (2026)

Speculative Streaming: Fast LLM Inference without Auxiliary Models
by: Bhendawade, Nikhil, et al.
Published: (2024)

ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)

Humans and LLMs Diverge on Probabilistic Inferences
by: Kamath, Gaurav, et al.
Published: (2026)

Knowing But Not Doing: Convergent Morality and Divergent Action in LLMs
by: Huang, Jen-tse, et al.
Published: (2026)

Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention
by: Kiruluta, Andrew
Published: (2026)

Mitigating Biases in Language Models via Bias Unlearning
by: Liu, Dianqing, et al.
Published: (2025)

N-gram-like Language Models Predict Reading Time Best
by: Michaelov, James A., et al.
Published: (2026)

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
by: Dorna, Vineeth, et al.
Published: (2025)

Avoiding Copyright Infringement via Large Language Model Unlearning
by: Dou, Guangyao, et al.
Published: (2024)

LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding
by: Lin, Gang, et al.
Published: (2026)

Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training
by: Tran, Toan, et al.
Published: (2025)

CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
by: Liu, Dong, et al.
Published: (2025)

AI-VERDE: A Gateway for Egalitarian Access to Large Language Model-Based Resources For Educational Institutions
by: Mithun, Paul, et al.
Published: (2025)

LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding
by: Xu, Chenkai, et al.
Published: (2025)

AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference
by: Lu, Kuan-Wei, et al.
Published: (2025)

Unlearning What Matters: Token-Level Attribution for Precise Language Model Unlearning
by: Wu, Jiawei, et al.
Published: (2026)

Accelerating Transformer Inference for Translation via Parallel Decoding
by: Santilli, Andrea, et al.
Published: (2023)

Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
by: Yang, Dingkang, et al.
Published: (2024)

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
by: Guo, Phillip, et al.
Published: (2024)

Connecting the Persian-speaking World through Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)

Training and Inference Efficiency of Encoder-Decoder Speech Models
by: Żelasko, Piotr, et al.
Published: (2025)

FlashDecoding++: Faster Large Language Model Inference on GPUs
by: Hong, Ke, et al.
Published: (2023)

Plato: Plan to Efficiently Decode for Large Language Model Inference
by: Jin, Shuowei, et al.
Published: (2024)

Learning-Time Encoding Shapes Unlearning in LLMs
by: Wu, Ruihan, et al.
Published: (2025)

Speculative Decoding for Multi-Sample Inference
by: Li, Yiwei, et al.
Published: (2025)

Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge
by: Lu, Weikai, et al.
Published: (2024)

Model Unlearning via Sparse Autoencoder Subspace Guided Projections
by: Wang, Xu, et al.
Published: (2025)

Dissecting Language Models: Machine Unlearning via Selective Pruning
by: Pochinkov, Nicholas, et al.
Published: (2024)

Nudging: Inference-time Alignment of LLMs via Guided Decoding
by: Fei, Yu, et al.
Published: (2024)

Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
by: Xia, Heming, et al.
Published: (2025)

Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism
by: Liu, Jiahao, et al.
Published: (2024)

Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
by: Xia, Heming, et al.
Published: (2024)