Bibliographic Details
Main Authors: Weissenbacher, Matthias; Agarwal, Rishabh; Kawahara, Yoshinobu
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2406.15025
author Weissenbacher, Matthias
Agarwal, Rishabh
Kawahara, Yoshinobu
contents An open challenge in reinforcement learning (RL) is the effective deployment of a trained policy to new or slightly different situations, as well as to semantically similar environments. We introduce the Symmetry-Invariant Transformer (SiT), a scalable vision transformer (ViT) that leverages both local and global data patterns in a self-supervised manner to improve generalisation. Central to our approach is Graph Symmetric Attention, which refines the traditional self-attention mechanism to preserve graph symmetries, resulting in invariant and equivariant latent representations. We showcase SiT's superior generalisation over ViTs on MiniGrid and Procgen RL benchmarks, and its sample efficiency on Atari 100k and CIFAR10.
format Preprint
id arxiv_https___arxiv_org_abs_2406_15025
institution arXiv
publishDate 2024
record_format arxiv
title SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning
topic Machine Learning
url https://arxiv.org/abs/2406.15025