Bibliographic Details
Main Authors: Chattaraj, Sourav, Raj, Kanak
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2603.00270
_version_ 1866911530221568000
author Chattaraj, Sourav
Raj, Kanak
contents When large language models encounter conflicting information in context, which memories survive: early or recent? We adapt classical interference paradigms from cognitive psychology to answer this question, testing 39 LLMs across diverse architectures and scales. Every model shows the same pattern: proactive interference (PI) dominates retroactive interference (RI) (Cohen's d = 1.73, p < 0.0001), meaning early encodings are protected at the cost of recent information -- the opposite of human memory, where RI typically dominates. Three findings indicate that RI and PI reflect separate memory mechanisms. First, RI and PI are uncorrelated (R^2 = 0.044), which argues against a unified "memory capacity." Second, model size predicts RI resistance (R^2 = 0.49) but not PI (R^2 = 0.06, n.s.), so only RI is capacity-dependent. Third, error analysis reveals distinct failure modes: RI failures are passive retrieval failures (51%), while PI failures show active primacy intrusion (56%); both show <1% hallucination. These patterns parallel the consolidation-retrieval distinction in cognitive science, suggesting that transformer attention creates a primacy bias with direct implications for interference-heavy applications.
format Preprint
id arxiv_https___arxiv_org_abs_2603_00270
institution arXiv
publishDate 2026
record_format arxiv
title Transformers Remember First, Forget Last: Dual-Process Interference in LLMs
topic Information Retrieval
Artificial Intelligence
Computation and Language
url https://arxiv.org/abs/2603.00270