| Main Authors: | Rašajski, Nemanja; Trivedi, Chintan; Makantasis, Konstantinos; Liapis, Antonios; Yannakakis, Georgios N. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computer Vision and Pattern Recognition; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2402.01335 |
| _version_ | 1866917824985825280 |
|---|---|
| author | Rašajski, Nemanja; Trivedi, Chintan; Makantasis, Konstantinos; Liapis, Antonios; Yannakakis, Georgios N. |
| contents | Domain randomisation enhances the transferability of vision models across visually distinct domains with similar content. However, current methods heavily depend on intricate simulation engines, hampering feasibility and scalability. This paper introduces BehAVE, a video understanding framework that utilises existing commercial video games for domain randomisation without accessing their simulation engines. BehAVE taps into the visual diversity of video games for randomisation and uses textual descriptions of player actions to align videos with similar content. We evaluate BehAVE across 25 first-person shooter (FPS) games using various video and text foundation models, demonstrating its robustness in domain randomisation. BehAVE effectively aligns player behavioural patterns and achieves zero-shot transfer to multiple unseen FPS games when trained on just one game. In a more challenging scenario, BehAVE enhances the zero-shot transferability of foundation models to unseen FPS games, even when trained on a game of a different genre, with improvements of up to 22%. BehAVE is available online at https://github.com/nrasajski/BehAVE. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2402_01335 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | BehAVE: Behaviour Alignment of Video Game Encodings |
| topic | Computer Vision and Pattern Recognition; Artificial Intelligence |
| url | https://arxiv.org/abs/2402.01335 |