| Main Authors: | Phan, Buu; Khisti, Ashish; Ullrich, Karen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.14954 |
Similar Items
Understanding and Mitigating Tokenization Bias in Language Models
by: Phan, Buu, et al.
Published: (2024)
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
by: Phan, Buu, et al.
Published: (2024)
Channel Simulation and Distributed Compression with Ensemble Rejection Sampling
by: Phan, Buu, et al.
Published: (2025)
List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression
by: Rowan, Joseph, et al.
Published: (2025)
Multi-Marginal Couplings for Metropolis-Hastings
by: Phan, Buu, et al.
Published: (2026)
X-Token: Projection-Guided Cross-Tokenizer Knowledge Distillation
by: Sreenivas, Sharath Turuvekere, et al.
Published: (2026)
Importance Matching Lemma for Lossy Compression with Side Information
by: Phan, Buu, et al.
Published: (2024)
One-Shot Broadcast Joint Source-Channel Coding with Codebook Diversity
by: Rowan, Joseph, et al.
Published: (2026)
On Self-Adaptive Perception Loss Function for Sequential Lossy Compression
by: Salehkalaibar, Sadaf, et al.
Published: (2025)
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation
by: Phan, Phuc, et al.
Published: (2024)
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
by: Zhang, Songming, et al.
Published: (2025)
Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
by: Khisti, Ashish, et al.
Published: (2024)
Token Distillation: Attention-aware Input Embeddings For New Tokens
by: Dobler, Konstantin, et al.
Published: (2025)
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
by: Su, Jingtong, et al.
Published: (2025)
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
by: Su, Jingtong, et al.
Published: (2024)
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
by: Huo, Mingjia, et al.
Published: (2024)
End-To-End Causal Effect Estimation from Unstructured Natural Language Data
by: Dhawan, Nikita, et al.
Published: (2024)
Multi-Token Prediction via Self-Distillation
by: Kirchenbauer, John, et al.
Published: (2026)
SupraTok: Cross-Boundary Tokenization for Enhanced Language Model Performance
by: Tănase, Andrei-Valentin, et al.
Published: (2025)
Language Modeling with Learned Meta-Tokens
by: Shah, Alok N., et al.
Published: (2025)
Parallel Token Prediction for Language Models
by: Draxler, Felix, et al.
Published: (2025)
Self-Distillation for Multi-Token Prediction
by: Zhao, Guoliang, et al.
Published: (2026)
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
by: Shi, Zhengyan, et al.
Published: (2024)
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models
by: Zhao, Siyan, et al.
Published: (2026)
On the Reasoning Abilities of Masked Diffusion Language Models
by: Svete, Anej, et al.
Published: (2025)
QA-Calibration of Language Model Confidence Scores
by: Manggala, Putra, et al.
Published: (2024)
Black-Box Detection of LLM-Generated Text Using Generalized Jensen-Shannon Divergence
by: Chen, Shuangyi, et al.
Published: (2025)
DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation
by: Maekawa, Aru, et al.
Published: (2024)
BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques
by: Kabir, Muhammad Rafsan, et al.
Published: (2024)
From Token to Token Pair: Efficient Prompt Compression for Large Language Models in Clinical Prediction
by: Zhu, Mingcheng, et al.
Published: (2026)
Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP
by: Remy, François, et al.
Published: (2024)
The Geometry of Tokens in Internal Representations of Large Language Models
by: Viswanathan, Karthik, et al.
Published: (2025)
RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring
by: Mohammadkhani, Ali Ghiasvand
Published: (2024)
Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions
by: Fang, Luyang, et al.
Published: (2025)
A Survey of On-Policy Distillation for Large Language Models
by: Song, Mingyang, et al.
Published: (2026)
Unsupervised Pretraining for Fact Verification by Language Model Distillation
by: Bazaga, Adrián, et al.
Published: (2023)
Adversarial Moment-Matching Distillation of Large Language Models
by: Jia, Chen
Published: (2024)
Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models
by: Liu, Zefang, et al.
Published: (2025)
Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs)
by: Rahman, Abrar, et al.
Published: (2024)
Investigating Automatic Scoring and Feedback using Large Language Models
by: Katuka, Gloria Ashiya, et al.
Published: (2024)