| Main Authors: | Cao, Hoang T. H.; Trinh, Hai D. V.; Quan, Tho; Truong, Lan V. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2603.18564 |
| _version_ | 1866908900703338496 |
|---|---|
| author | Cao, Hoang T. H.; Trinh, Hai D. V.; Quan, Tho; Truong, Lan V. |
| author_facet | Cao, Hoang T. H.; Trinh, Hai D. V.; Quan, Tho; Truong, Lan V. |
| contents | Recent work has shown that Transformers can perform in-context learning for linear regression under restrictive assumptions, including i.i.d. data, Gaussian noise, and Gaussian regression coefficients. However, real-world data often violate these assumptions: the distributions of inputs, noise, and coefficients are typically unknown, non-Gaussian, and may exhibit dependency across the prompt. This raises a fundamental question: can Transformers learn effectively in-context under realistic distributional uncertainty? We study in-context learning for noisy linear regression under a broad range of distributional shifts, including non-Gaussian coefficients, heavy-tailed noise, and non-i.i.d. prompts. We compare Transformers against classical baselines that are optimal or suboptimal under the corresponding maximum-likelihood criteria. Across all settings, Transformers consistently match or outperform these baselines, demonstrating robust in-context adaptation beyond classical estimators. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_18564 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Transformers Learn Robust In-Context Regression under Distributional Uncertainty |
| topic | Machine Learning; Artificial Intelligence |
| url | https://arxiv.org/abs/2603.18564 |
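
The setting summarized in the contents field (noisy linear regression prompts with non-Gaussian coefficients and heavy-tailed noise, scored against classical estimators) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the preprint: Gaussian inputs, Laplace coefficients, Student-t noise, the helper names `make_prompt`, `ols_fit`, `ridge_fit`, and `query_error`, and the choice of ordinary least squares and ridge regression as stand-in baselines. The record does not say which baselines the authors used.

```python
import numpy as np


def make_prompt(n, d, rng, df=3.0, noise_scale=0.5):
    """Sample one regression prompt (X, y) and its hidden coefficients w.

    Illustrative assumptions (not from the preprint): Gaussian inputs,
    Laplace coefficients, Student-t noise with `df` degrees of freedom.
    """
    X = rng.standard_normal((n, d))
    w = rng.laplace(size=d)                            # non-Gaussian coefficients
    noise = noise_scale * rng.standard_t(df, size=n)   # heavy-tailed noise
    y = X @ w + noise
    return X, y, w


def ols_fit(X, y):
    """Ordinary least squares: the maximum-likelihood fit under Gaussian noise."""
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w_hat


def ridge_fit(X, y, lam=1.0):
    """Ridge regression: the MAP fit under a Gaussian prior and Gaussian noise."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def query_error(w_hat, w, x_query):
    """Squared error of the noiseless prediction at a held-out query input."""
    return float((x_query @ (w_hat - w)) ** 2)


# Usage: score both stand-in baselines on a single sampled prompt.
rng = np.random.default_rng(0)
X, y, w = make_prompt(n=40, d=5, rng=rng)
x_query = rng.standard_normal(5)
print("OLS query error:  ", query_error(ols_fit(X, y), w, x_query))
print("ridge query error:", query_error(ridge_fit(X, y), w, x_query))
```

Both stand-in estimators are derived under Gaussian assumptions, so they are mismatched to the Laplace coefficients and Student-t noise sampled here. That mismatch is the kind of distributional shift the abstract describes.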