Bibliographic Details
Main Authors: Sanford, Clayton; Hsu, Daniel; Telgarsky, Matus
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2408.14332
author Sanford, Clayton
Hsu, Daniel
Telgarsky, Matus
contents A simple communication complexity argument proves that no one-layer transformer can solve the induction heads task unless its size is exponentially larger than the size sufficient for a two-layer transformer.
format Preprint
id arxiv_https___arxiv_org_abs_2408_14332
institution arXiv
publishDate 2024
record_format arxiv
title One-layer transformers fail to solve the induction heads task
topic Machine Learning
url https://arxiv.org/abs/2408.14332