Library Catalog

Bibliographic Details
Main Authors:	Phan, Buu; Khisti, Ashish; Ullrich, Karen
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2512.14954

Similar Items

Understanding and Mitigating Tokenization Bias in Language Models
by: Phan, Buu, et al.
Published: (2024)

Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
by: Phan, Buu, et al.
Published: (2024)

Channel Simulation and Distributed Compression with Ensemble Rejection Sampling
by: Phan, Buu, et al.
Published: (2025)

List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression
by: Rowan, Joseph, et al.
Published: (2025)

Multi-Marginal Couplings for Metropolis-Hastings
by: Phan, Buu, et al.
Published: (2026)

X-Token: Projection-Guided Cross-Tokenizer Knowledge Distillation
by: Sreenivas, Sharath Turuvekere, et al.
Published: (2026)

Importance Matching Lemma for Lossy Compression with Side Information
by: Phan, Buu, et al.
Published: (2024)

One-Shot Broadcast Joint Source-Channel Coding with Codebook Diversity
by: Rowan, Joseph, et al.
Published: (2026)

On Self-Adaptive Perception Loss Function for Sequential Lossy Compression
by: Salehkalaibar, Sadaf, et al.
Published: (2025)

Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
by: Phan, Phuc, et al.
Published: (2024)

AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
by: Zhang, Songming, et al.
Published: (2025)

Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
by: Khisti, Ashish, et al.
Published: (2024)

Token Distillation: Attention-aware Input Embeddings For New Tokens
by: Dobler, Konstantin, et al.
Published: (2025)

From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
by: Su, Jingtong, et al.
Published: (2025)

Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
by: Su, Jingtong, et al.
Published: (2024)

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
by: Huo, Mingjia, et al.
Published: (2024)

End-To-End Causal Effect Estimation from Unstructured Natural Language Data
by: Dhawan, Nikita, et al.
Published: (2024)

Multi-Token Prediction via Self-Distillation
by: Kirchenbauer, John, et al.
Published: (2026)

SupraTok: Cross-Boundary Tokenization for Enhanced Language Model Performance
by: Tănase, Andrei-Valentin, et al.
Published: (2025)

Language Modeling with Learned Meta-Tokens
by: Shah, Alok N., et al.
Published: (2025)

Parallel Token Prediction for Language Models
by: Draxler, Felix, et al.
Published: (2025)

Self-Distillation for Multi-Token Prediction
by: Zhao, Guoliang, et al.
Published: (2026)

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
by: Shi, Zhengyan, et al.
Published: (2024)

Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models
by: Zhao, Siyan, et al.
Published: (2026)

On the Reasoning Abilities of Masked Diffusion Language Models
by: Svete, Anej, et al.
Published: (2025)

QA-Calibration of Language Model Confidence Scores
by: Manggala, Putra, et al.
Published: (2024)

Black-Box Detection of LLM-Generated Text Using Generalized Jensen-Shannon Divergence
by: Chen, Shuangyi, et al.
Published: (2025)

DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation
by: Maekawa, Aru, et al.
Published: (2024)

BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques
by: Kabir, Muhammad Rafsan, et al.
Published: (2024)

From Token to Token Pair: Efficient Prompt Compression for Large Language Models in Clinical Prediction
by: Zhu, Mingcheng, et al.
Published: (2026)

Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP
by: Remy, François, et al.
Published: (2024)

The Geometry of Tokens in Internal Representations of Large Language Models
by: Viswanathan, Karthik, et al.
Published: (2025)

RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring
by: Mohammadkhani, Ali Ghiasvand
Published: (2024)

Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions
by: Fang, Luyang, et al.
Published: (2025)

A Survey of On-Policy Distillation for Large Language Models
by: Song, Mingyang, et al.
Published: (2026)

Unsupervised Pretraining for Fact Verification by Language Model Distillation
by: Bazaga, Adrián, et al.
Published: (2023)

Adversarial Moment-Matching Distillation of Large Language Models
by: Jia, Chen
Published: (2024)

Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models
by: Liu, Zefang, et al.
Published: (2025)

Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs)
by: Rahman, Abrar, et al.
Published: (2024)

Investigating Automatic Scoring and Feedback using Large Language Models
by: Katuka, Gloria Ashiya, et al.
Published: (2024)