| Main Authors: | Minixhofer, Benjamin, Murray, Tyler, Limisiewicz, Tomasz, Korhonen, Anna, Zettlemoyer, Luke, Smith, Noah A., Ponti, Edoardo M., Soldaini, Luca, Hofmann, Valentin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.15586 |
Similar Items
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
by: Blevins, Terra, et al.
Published: (2024)
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
by: Limisiewicz, Tomasz, et al.
Published: (2024)
Zero-Shot Tokenizer Transfer
by: Minixhofer, Benjamin, et al.
Published: (2024)
Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching
by: Minixhofer, Benjamin, et al.
Published: (2025)
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
by: Ahia, Orevaoghene, et al.
Published: (2024)
Scaling Sparse Fine-Tuning to Large Language Models
by: Ansell, Alan, et al.
Published: (2024)
Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation
by: Limisiewicz, Tomasz, et al.
Published: (2025)
Bootstrapping Action-Grounded Visual Dynamics in Unified Vision-Language Models
by: Qiu, Yifu, et al.
Published: (2025)
Emergent Communication Pretraining for Few-Shot Machine Translation
by: Li, Yaoyiran, et al.
Published: (2020)
Demystifying Prompts in Language Models via Perplexity Estimation
by: Gonen, Hila, et al.
Published: (2022)
Fast Byte Latent Transformer
by: Kallini, Julie, et al.
Published: (2026)
Spectral Editing of Activations for Large Language Model Alignment
by: Qiu, Yifu, et al.
Published: (2024)
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
by: Gonen, Hila, et al.
Published: (2024)
Compute Optimal Tokenization
by: Limisiewicz, Tomasz, et al.
Published: (2026)
Debiasing Algorithm through Model Adaptation
by: Limisiewicz, Tomasz, et al.
Published: (2023)
Retrofitting Large Language Models with Dynamic Tokenization
by: Feher, Darius, et al.
Published: (2024)
Beyond Literal Token Overlap: Token Alignability for Multilinguality
by: Hämmerl, Katharina, et al.
Published: (2025)
Self-Improving World Modelling with Latent Actions
by: Qiu, Yifu, et al.
Published: (2026)
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
by: Min, Sewon, et al.
Published: (2023)
Evaluating Copyright Takedown Methods for Language Models
by: Wei, Boyi, et al.
Published: (2024)
Navigating the Alignment-Calibration Trade-off: A Pareto-Superior Frontier via Model Merging
by: Hu, Tiancheng, et al.
Published: (2025)
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
by: Hu, Yushi, et al.
Published: (2024)
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
by: Chang, Tyler A., et al.
Published: (2025)
Fine-tuning Large Language Models with Sequential Instructions
by: Hu, Hanxu, et al.
Published: (2024)
Quantifying Language Disparities in Multilingual Large Language Models
by: Hu, Songbo, et al.
Published: (2025)
Comparing Hallucination Detection Metrics for Multilingual Generation
by: Kang, Haoqiang, et al.
Published: (2024)
Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation
by: Miranda, Lester James V., et al.
Published: (2026)
Cross-Lingual and Cross-Cultural Variation in Image Descriptions
by: Berger, Uri, et al.
Published: (2024)
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
by: Shi, Weijia, et al.
Published: (2024)
Teaching Models to Understand (but not Generate) High-risk Data
by: Wang, Ryan, et al.
Published: (2025)
Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet?
by: Razumovskaia, Evgeniia, et al.
Published: (2024)
SuperBPE: Space Travel for Language Models
by: Liu, Alisa, et al.
Published: (2025)
Cultural Learning-Based Culture Adaptation of Language Models
by: Liu, Chen Cecilia, et al.
Published: (2025)
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
by: Chen, Tong, et al.
Published: (2024)
Paloma: A Benchmark for Evaluating Language Model Fit
by: Magnusson, Ian, et al.
Published: (2023)
Micro Language Models Enable Instant Responses
by: Cheng, Wen, et al.
Published: (2026)
FlexOlmo: Open Language Models for Flexible Data Use
by: Shi, Weijia, et al.
Published: (2025)
On Bilingual Lexicon Induction with Large Language Models
by: Li, Yaoyiran, et al.
Published: (2023)
Olmix: A Framework for Data Mixing Throughout LM Development
by: Chen, Mayee F., et al.
Published: (2026)
Olmo Hybrid: From Theory to Practice and Back
by: Merrill, William, et al.
Published: (2026)