Bibliographic Details
Main Authors: Rašajski, Nemanja; Trivedi, Chintan; Makantasis, Konstantinos; Liapis, Antonios; Yannakakis, Georgios N.
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access: https://arxiv.org/abs/2402.01335
author Rašajski, Nemanja
Trivedi, Chintan
Makantasis, Konstantinos
Liapis, Antonios
Yannakakis, Georgios N.
contents Domain randomisation enhances the transferability of vision models across visually distinct domains with similar content. However, current methods heavily depend on intricate simulation engines, hampering feasibility and scalability. This paper introduces BehAVE, a video understanding framework that utilises existing commercial video games for domain randomisation without accessing their simulation engines. BehAVE taps into the visual diversity of video games for randomisation and uses textual descriptions of player actions to align videos with similar content. We evaluate BehAVE across 25 first-person shooter (FPS) games using various video and text foundation models, demonstrating its robustness in domain randomisation. BehAVE effectively aligns player behavioural patterns and achieves zero-shot transfer to multiple unseen FPS games when trained on just one game. In a more challenging scenario, BehAVE enhances the zero-shot transferability of foundation models to unseen FPS games, even when trained on a game of a different genre, with improvements of up to 22%. BehAVE is available online at https://github.com/nrasajski/BehAVE.
format Preprint
id arxiv_https___arxiv_org_abs_2402_01335
institution arXiv
publishDate 2024
record_format arxiv
title BehAVE: Behaviour Alignment of Video Game Encodings
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2402.01335