Staff View: Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations :: Library Catalog

Bibliographic Details
Main Authors:	Karim, Ahmed; Sheaib, Fatima; Khamis, Zein; Chlon, Maggie; Awada, Jad; Chlon, Leon
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.19239

_version_	1866915811300474880
author	Karim, Ahmed; Sheaib, Fatima; Khamis, Zein; Chlon, Maggie; Awada, Jad; Chlon, Leon
author_facet	Karim, Ahmed; Sheaib, Fatima; Khamis, Zein; Chlon, Maggie; Awada, Jad; Chlon, Leon
contents	Large language models can follow complex procedures yet fail at a seemingly trivial final step: reporting a value they themselves computed moments earlier. We study this phenomenon as procedural hallucination: failure to execute a verifiable, prompt-grounded specification even when the correct value is present in context. In long-context binding tasks with a known single-token candidate set, we find that many errors are readout-stage routing failures. Specifically, failures decompose into Stage 2A (gating) errors, where the model does not enter answer mode, and Stage 2B (binding) errors, where it enters answer mode but selects the wrong candidate (often due to recency bias). In the hard regime, Stage 2B accounts for most errors across model families in our tasks (Table 1). On Stage 2B error trials, a linear probe on the final-layer residual stream recovers the correct value far above chance (e.g., 74% vs. 2% on Qwen2.5-3B; Table 2), indicating that the answer is encoded but not used. We formalize "present but not used" via available vs. used mutual information and pseudo-prior interventions, yielding output-computable diagnostics and information-budget certificates. Finally, an oracle checkpointing intervention that restates the true binding near the query can nearly eliminate Stage 2B failures at long distance (e.g., Qwen2.5-3B 0/400 → 399/400 at k = 1024; Table 8).
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_19239
institution	arXiv
publishDate	2026
record_format	arxiv
title	Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations
topic	Machine Learning
url	https://arxiv.org/abs/2602.19239
