| Main Authors: | Kandpal, Nikhil, Lester, Brian, Raffel, Colin, Majstorovic, Sebastian, Biderman, Stella, Abbasi, Baber, Soldaini, Luca, Shippole, Enrico, Cooper, A. Feder, Skowron, Aviya, Kirchenbauer, John, Longpre, Shayne, Sutawika, Lintang, Albalak, Alon, Xu, Zhenlin, Penedo, Guilherme, Allal, Loubna Ben, Bakouch, Elie, Pressman, John David, Fan, Honglu, Stander, Dashiell, Song, Guangyu, Gokaslan, Aaron, Goldstein, Tom, Bartoldson, Brian R., Kailkhura, Bhavya, Murray, Tyler |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.05209 |
Similar Items
Grokking Group Multiplication with Cosets
by: Stander, Dashiell, et al.
Published: (2023)
Double Visual Defense: Adversarial Pre-training and Instruction Tuning for Improving Vision-Language Model Robustness
by: Wang, Zeyu, et al.
Published: (2025)
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
by: Bartoldson, Brian R., et al.
Published: (2024)
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
by: Hägele, Alexander, et al.
Published: (2024)
Position: The Most Expensive Part of an LLM should be its Training Data
by: Kandpal, Nikhil, et al.
Published: (2025)
Gained in Translation: Privileged Pairwise Judges Enhance Multilingual Reasoning
by: Sutawika, Lintang, et al.
Published: (2026)
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
by: Geiping, Jonas, et al.
Published: (2025)
Multi-Token Prediction via Self-Distillation
by: Kirchenbauer, John, et al.
Published: (2026)
Get RICH or Die Scaling: Profitably Trading Inference Compute for Robustness
by: McDonald, Tavish, et al.
Published: (2025)
YaRN: Efficient Context Window Extension of Large Language Models
by: Peng, Bowen, et al.
Published: (2023)
AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
by: Liu, Fengyuan, et al.
Published: (2024)
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
by: McLeish, Sean, et al.
Published: (2025)
Transformers Can Do Arithmetic with the Right Embeddings
by: McLeish, Sean, et al.
Published: (2024)
Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
by: Zheng, Haizhong, et al.
Published: (2025)
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion
by: Christopher, Jacob K, et al.
Published: (2024)
ELFS: Label-Free Coreset Selection with Proxy Training Dynamics
by: Zheng, Haizhong, et al.
Published: (2024)
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
by: Penedo, Guilherme, et al.
Published: (2024)
AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security
by: Cai, Zikui, et al.
Published: (2025)
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
by: Longpre, Shayne, et al.
Published: (2024)
Enhancing Training Data Attribution with Representational Optimization
by: Sun, Weiwei, et al.
Published: (2025)
Self-Directed Synthetic Dialogues and Revisions Technical Report
by: Lambert, Nathan, et al.
Published: (2024)
Scaling Self-Supervised Representation Learning for Symbolic Piano Performance
by: Bradshaw, Louis, et al.
Published: (2025)
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
by: Allal, Loubna Ben, et al.
Published: (2025)
How Can We Synthesize High-Quality Pretraining Data? A Systematic Study of Prompt Design, Generator Model, and Source Data
by: Niklaus, Joel, et al.
Published: (2026)
STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
by: Wang, Zijun, et al.
Published: (2025)
Literature of the americas in the making: U.S. writers and translation in sur, 1931-1944
by: Gorica Majstorovic
Published: (2013)
The German‐Soviet War: Combat, Occupation, and Legacies by Jeff Rutherford and Robert von Maier, eds. Ithaca: Cornell University Press, 2025. 600 pp. $59.95. ISBN 978‐1‐5017‐8108‐7
by: Vojin Majstorovic
Published: (2026)
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
by: Bartoldson, Brian, et al.
Published: (2025)
EL YESO DE ZANCOLLI EN EL TRATAMIENTO DE LAS LESIONES TRAUMATICAS DE LA MANO EN NIÑOS
by: Dashiell Cañizares Betancourt
Published: (2006)
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
by: Yue, Xiang, et al.
Published: (2024)
Lessons from the Trenches on Reproducible Evaluation of Language Models
by: Biderman, Stella, et al.
Published: (2024)
A Survey on Data Selection for Language Models
by: Albalak, Alon, et al.
Published: (2024)
ENTERPRISE ARCHITECTURE AS AN APPROACH TO THE DEVELOPMENT OF INFORMATION SYSTEMS
by: Milosav N. Majstorović
Published: (2018)
BUSINESS AND IT ALIGNMENT
by: Milosav N. Majstorović
Published: (2016)
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
by: Liu, Emmy, et al.
Published: (2025)
Ingeniería del software : un enfoque práctico / Roger S. Pressman ; traductor Jesús Elmer Murrieta Murrieta, Eloy Pineda Rojas, Víctor Campos Olguín
by: Pressman, Roger S
Published: (2005)
Ingeniería del software : un enfoque práctico / Roger S. Pressman ; traductor, Víctor Campos Olguín, Javier Enríquez Brito
by: Pressman, Roger S
Published: (2010)
Israeli Unilateralism and Israeli—Palestinian Relations, 2001-2006 / Jeremy Pressman
by: Pressman, Jeremy
Published: (2001)
Ingeniería del software : un enfoque práctico / Roger S. Pressman ; traductor José María Troya, Luis Hernández Yañez
by: Pressman, Roger S
Published: (1988)
Software engineering; a practitioner's approach / Roger S. Pressman
by: Pressman, Roger S
Published: (1982)