Bibliographic Details
Main Authors: Cao, Hoang T. H., Trinh, Hai D. V., Quan, Tho, Truong, Lan V.
Format: Preprint
Published: 2026
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2603.18564
author Cao, Hoang T. H.
Trinh, Hai D. V.
Quan, Tho
Truong, Lan V.
contents Recent work has shown that Transformers can perform in-context learning for linear regression under restrictive assumptions, including i.i.d. data, Gaussian noise, and Gaussian regression coefficients. However, real-world data often violate these assumptions: the distributions of inputs, noise, and coefficients are typically unknown, non-Gaussian, and may exhibit dependency across the prompt. This raises a fundamental question: can Transformers learn effectively in-context under realistic distributional uncertainty? We study in-context learning for noisy linear regression under a broad range of distributional shifts, including non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts. We compare Transformers against classical baselines that are optimal or suboptimal under the corresponding maximum-likelihood criteria. Across all settings, Transformers consistently match or outperform these baselines, demonstrating robust in-context adaptation beyond classical estimators.
format Preprint
id arxiv_https___arxiv_org_abs_2603_18564
institution arXiv
publishDate 2026
record_format arxiv
title Transformers Learn Robust In-Context Regression under Distributional Uncertainty
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2603.18564