Staff View: Early Data Exposure Improves Robustness to Subsequent Fine-Tuning :: Library Catalog

Bibliographic Details
Main Authors:	Feng, Lawrence; Ghosal, Gaurav R.; Springer, Jacob Mitchell; Zhong, Ziqian; Raghunathan, Aditi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.12705

_version_	1866909038383464448
author	Feng, Lawrence; Ghosal, Gaurav R.; Springer, Jacob Mitchell; Zhong, Ziqian; Raghunathan, Aditi
author_facet	Feng, Lawrence; Ghosal, Gaurav R.; Springer, Jacob Mitchell; Zhong, Ziqian; Raghunathan, Aditi
contents	How can we train models whose post-trained capabilities survive subsequent fine-tuning? Rather than focusing on downstream interventions to mitigate forgetting of upstream capabilities, we study how upstream training choices - that is, the manner in which a capability is acquired - shape how robustly that capability is retained. We investigate this question in a controlled three-stage language-model pipeline: pretraining, post-training to acquire a target capability, and downstream fine-tuning on a new objective. Across 135M and 1B models, two post-training domains, and two downstream fine-tuning tasks, we find that immediate post-training performance does not reliably predict retention after subsequent fine-tuning: training recipes that look equivalent immediately after post-training can retain the target capability very differently after subsequent fine-tuning. In particular, early exposure - mixing post-training data into pretraining - consistently improves the frontier between retained upstream performance and downstream performance. In compute-matched experiments, where the target data must be allocated between pretraining and post-training, we find that the optimum lies at neither extreme. Together with our other empirical and theoretical findings, this supports the view that post-training drives immediate specialization while early exposure improves robustness to later forgetting. Replay and dropout, typically used to mitigate forgetting as it occurs during fine-tuning, provide complementary gains to early exposure when applied during post-training. Our findings suggest that robustness to subsequent fine-tuning should be treated as a first-class objective of upstream training, addressed preventatively through choices like early exposure rather than reactively during fine-tuning itself.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_12705
institution	arXiv
publishDate	2026
record_format	arxiv
title	Early Data Exposure Improves Robustness to Subsequent Fine-Tuning
topic	Machine Learning
url	https://arxiv.org/abs/2605.12705

Similar Items