| Main Authors: | Weissenbacher, Matthias; Agarwal, Rishabh; Kawahara, Yoshinobu |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2406.15025 |
| _version_ | 1866911928690933760 |
|---|---|
| author | Weissenbacher, Matthias; Agarwal, Rishabh; Kawahara, Yoshinobu |
| author_facet | Weissenbacher, Matthias; Agarwal, Rishabh; Kawahara, Yoshinobu |
| contents | An open challenge in reinforcement learning (RL) is the effective deployment of a trained policy to new or slightly different situations as well as semantically similar environments. We introduce the Symmetry-Invariant Transformer (SiT), a scalable vision transformer (ViT) that leverages both local and global data patterns in a self-supervised manner to improve generalisation. Central to our approach is Graph Symmetric Attention, which refines the traditional self-attention mechanism to preserve graph symmetries, resulting in invariant and equivariant latent representations. We showcase SiT's superior generalisation over ViTs on the MiniGrid and Procgen RL benchmarks, and its sample efficiency on Atari 100k and CIFAR10. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2406_15025 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2406.15025 |