Staff View: On the Universality of Transformer Architectures; How Much Attention Is Enough? :: Library Catalog

Bibliographic Details
Main Authors:	Abbasi, Amirreza; Hooshmand, Mohsen
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2512.18445

_version_	1866915688286781440
author	Abbasi, Amirreza; Hooshmand, Mohsen
author_facet	Abbasi, Amirreza; Hooshmand, Mohsen
contents	Transformers are crucial across many AI fields, such as large language models, computer vision, and reinforcement learning. This prominence stems from the architecture's perceived universality and scalability compared to alternatives. This work examines the problem of universality in Transformers, reviews recent progress, including architectural refinements such as structural minimality and approximation rates, and surveys state-of-the-art advances that inform both theoretical and practical understanding. Our aim is to clarify what is currently known about Transformer expressiveness, separate robust guarantees from fragile ones, and identify key directions for future theoretical research.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_18445
institution	arXiv
publishDate	2025
record_format	arxiv
title	On the Universality of Transformer Architectures; How Much Attention Is Enough?
topic	Machine Learning
url	https://arxiv.org/abs/2512.18445

Similar Items