Staff View: Training In-Context and In-Weights Mixtures Via Contrastive Context Sampling :: Library Catalog

Bibliographic Details
Main Authors:	Malu, Deeptanshu; Malu, Deevyanshu; Nemiwal, Aditya; Sarawagi, Sunita
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2604.01601

_version_	1866908933243797504
author	Malu, Deeptanshu; Malu, Deevyanshu; Nemiwal, Aditya; Sarawagi, Sunita
author_facet	Malu, Deeptanshu; Malu, Deevyanshu; Nemiwal, Aditya; Sarawagi, Sunita
contents	We investigate training strategies that co-develop in-context learning (ICL) and in-weights learning (IWL), along with the ability to switch between them based on context relevance. Although current LLMs exhibit both modes, standard task-specific fine-tuning often erodes ICL, which motivates IC-Train: fine-tuning with in-context examples. Prior work has shown that the emergence of ICL after IC-Train depends on factors such as task diversity and training duration. In this paper we show that the similarity structure between target inputs and context examples also plays an important role. Random contexts lead to loss of ICL and to IWL dominance, while contexts containing only similar examples cause ICL to degenerate into copying labels without regard to relevance. To address this, we propose a simple Contrastive-Context sampling strategy that enforces two types of contrast: (1) a mix of similar and random examples within a context, to evolve a correct form of ICL, and (2) varying grades of similarity across contexts, to evolve ICL-IWL mixtures. We give insight into the importance of such contrast through theoretical analysis of a minimal model, and we validate it with extensive empirical evaluation on four LLMs and several tasks. Diagnostic probes confirm that contrasted contexts yield stable ICL-IWL mixtures, avoiding collapse into pure ICL, pure IWL, or copying.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_01601
institution	arXiv
publishDate	2026
record_format	arxiv
title	Training In-Context and In-Weights Mixtures Via Contrastive Context Sampling
topic	Machine Learning
url	https://arxiv.org/abs/2604.01601
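
The two contrasts described in the contents field (similar and random examples mixed within one context, and the grade of similarity varied across contexts) can be illustrated with a small sampler. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the `sim_fraction` and `grade` parameters, the use of cosine similarity over precomputed embeddings, and the idea of modelling a similarity grade as an offset into the nearest-neighbour ranking are all hypothetical choices made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def contrastive_context(target_idx, embeddings, k, sim_fraction, grade, rng):
    """Pick k context indices for one target (hypothetical scheme).

    sim_fraction: share of the context filled with similar examples;
                  the rest is sampled at random (within-context contrast).
    grade:        offset into the similarity ranking; 0 uses the nearest
                  neighbours, larger values use less similar ones
                  (across-context contrast). Requires k <= len(pool) - 1.
    """
    others = [i for i in range(len(embeddings)) if i != target_idx]
    ranked = sorted(
        others,
        key=lambda i: cosine(embeddings[target_idx], embeddings[i]),
        reverse=True,
    )
    n_sim = round(k * sim_fraction)
    # Clamp the window so it never runs off the end of the ranking.
    start = min(grade, max(len(ranked) - n_sim, 0))
    similar = ranked[start:start + n_sim]
    remaining = [i for i in others if i not in similar]
    randoms = rng.sample(remaining, k - len(similar))
    context = similar + randoms
    rng.shuffle(context)  # avoid a fixed position for similar examples
    return context
```

Calling the sampler with a different `grade` (and, if desired, a different `sim_fraction`) for each training example would give contexts whose relevance to the target varies, which is the kind of variation the abstract associates with ICL-IWL mixtures. Setting `sim_fraction=0` corresponds to the random-context setting and `sim_fraction=1` to the similar-only setting that the abstract reports as failure modes.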
