Library Catalog

Bibliographic Details
Main Authors:	Niklaus, Joel; Yamaguchi, Atsuki; Štefánik, Michal; Penedo, Guilherme; Kydlíček, Hynek; Bakouch, Elie; Tunstall, Lewis; Beeching, Edward Emanuel; Frere, Thibaud; Raffel, Colin; von Werra, Leandro; Wolf, Thomas
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2604.13977

Similar Items

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
by: Penedo, Guilherme, et al.
Published: (2024)

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
by: Penedo, Guilherme, et al.
Published: (2025)

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
by: Allal, Loubna Ben, et al.
Published: (2025)

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
by: Hägele, Alexander, et al.
Published: (2024)

FineVision: Open Data Is All You Need
by: Wiedmann, Luis, et al.
Published: (2025)

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
by: Qu, Yuxiao, et al.
Published: (2025)

SmolVLM: Redefining small and efficient multimodal models
by: Marafioti, Andrés, et al.
Published: (2025)

Position: The Most Expensive Part of an LLM should be its Training Data
by: Kandpal, Nikhil, et al.
Published: (2025)

The Sustainability Gap in Robotics: A Large-Scale Survey of Sustainability Awareness in 50,000 Research Articles
by: Skuric, Antun, et al.
Published: (2026)

Efficiently Estimating Data Efficiency for Language Model Fine-tuning
by: Je, Gyung Hyun, et al.
Published: (2025)

How Can We Effectively Expand the Vocabulary of LLMs with 0.01GB of Target Language Text?
by: Yamaguchi, Atsuki, et al.
Published: (2024)

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
by: Patel, Ajay, et al.
Published: (2024)

DABstep: Data Agent Benchmark for Multi-step Reasoning
by: Egg, Alex, et al.
Published: (2025)

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems
by: LM-Provers, et al.
Published: (2026)

Adapting Chat Language Models Using Only Target Unlabeled Language Data
by: Yamaguchi, Atsuki, et al.
Published: (2024)

Uncovering Model Processing Strategies with Non-Negative Per-Example Fisher Factorization
by: Matena, Michael, et al.
Published: (2023)

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
by: Deng, Haikang, et al.
Published: (2023)

Enhancing Training Data Attribution with Representational Optimization
by: Sun, Weiwei, et al.
Published: (2025)

Concept-aware Data Construction Improves In-context Learning of Language Models
by: Štefánik, Michal, et al.
Published: (2024)

An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference
by: Yamaguchi, Atsuki, et al.
Published: (2024)

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks
by: Yamaguchi, Atsuki, et al.
Published: (2026)

Scaling Data-Constrained Language Models
by: Muennighoff, Niklas, et al.
Published: (2023)

Verulamium Excavations
by: Frere, Sheppard
Published: (2020)

Verulamium Excavations. Volume II
by: Frere, Sheppard
Published: (2021)

AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
by: Liu, Fengyuan, et al.
Published: (2024)

Merging by Matching Models in Task Parameter Subspaces
by: Tam, Derek, et al.
Published: (2023)

Soft Merging of Experts with Adaptive Routing
by: Muqeeth, Mohammed, et al.
Published: (2023)

tt386/tuning-selection: Code release for "Tuning spatial distributions of selection pressure to suppress emergence of resistance"
by: Tunstall, Thomas
Published: (2026)

How Social Network Structure Impacts the Ability of Zealots to Promote Weak Opinions
by: Tunstall, Thomas
Published: (2024)

Model Merging via Data-Free Covariance Estimation
by: Hameed, Marawan Gamal Abdel, et al.
Published: (2026)

Obsidian : 3D documentation
by: Werra, Dagmara H.
Published: (2026)

Świeciechów flint : 3D documentation
by: Werra, Dagmara H.
Published: (2026)

Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
by: Yamaguchi, Atsuki, et al.
Published: (2025)

Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus
by: Morishita, Terufumi, et al.
Published: (2024)

Circulación de objetos, personas y saberes técnicos en el humedal del río Salado Bonaerense, Argentina
by: Frere, María Magdalena
Published: (2022)