Bibliographic Details
Main Authors: Thilakarathne, Haritha; Nibali, Aiden; He, Zhen; Morgan, Stuart
Format: Preprint
Published: 2021
Subjects:
Online Access: https://arxiv.org/abs/2108.04186
author Thilakarathne, Haritha
Nibali, Aiden
He, Zhen
Morgan, Stuart
author_facet Thilakarathne, Haritha
Nibali, Aiden
He, Zhen
Morgan, Stuart
contents We introduce a novel deep-learning-based approach to group activity recognition called the Pose Only Group Activity Recognition System (POGARS), which uses only the tracked poses of people to predict the group activity being performed. In contrast to existing approaches to group activity recognition, POGARS uses 1D CNNs to learn the spatiotemporal dynamics of the individuals involved in a group activity and forgoes learning features from pixel data. The proposed model uses a spatial and temporal attention mechanism to infer person-wise importance, and multi-task learning to perform group and individual action classification simultaneously. Experimental results confirm that POGARS achieves highly competitive results compared to state-of-the-art methods on a widely used public volleyball dataset, despite using only tracked pose as input. Furthermore, our experiments show that, by using only pose as input, POGARS generalizes better than methods that use RGB as input.
format Preprint
id arxiv_https___arxiv_org_abs_2108_04186
institution arXiv
publishDate 2021
record_format arxiv
spellingShingle Pose is all you need: The pose only group activity recognition system (POGARS)
Thilakarathne, Haritha
Nibali, Aiden
He, Zhen
Morgan, Stuart
Computer Vision and Pattern Recognition
title Pose is all you need: The pose only group activity recognition system (POGARS)
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2108.04186
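
The abstract describes a pipeline in which 1D convolutions summarize each person's pose track over time and an attention mechanism weights people by importance before a group-level prediction. The following is a minimal illustrative sketch of that general idea only, not the authors' implementation. It assumes each person's track is reduced to a single scalar per frame, and every name in it (conv1d_valid, person_feature, group_feature, attn_scale) is hypothetical. In a multi-task setup, the per-person features would feed an individual-action classifier and the pooled feature a group-activity classifier; those heads are omitted here.

```python
import math

def conv1d_valid(seq, kernel):
    """Slide a 1D kernel along a sequence (stride 1, no padding)."""
    k = len(kernel)
    return [sum(seq[t + i] * kernel[i] for i in range(k))
            for t in range(len(seq) - k + 1)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def person_feature(track, kernel):
    """Temporal convolution, ReLU, then max-pooling over time.

    Assumption: `track` is one scalar pose value per frame, a heavy
    simplification of a full set of tracked keypoints.
    """
    return max(max(0.0, v) for v in conv1d_valid(track, kernel))

def group_feature(tracks, kernel, attn_scale=1.0):
    """Attention-weighted pooling of per-person features.

    Assumption: the attention score is the feature scaled by a
    hypothetical constant `attn_scale`; a trained model would use
    learned parameters instead.
    """
    feats = [person_feature(t, kernel) for t in tracks]
    weights = softmax([attn_scale * f for f in feats])
    pooled = sum(w * f for w, f in zip(weights, feats))
    return feats, weights, pooled
```

For example, with a temporal-difference kernel [-1, 1], a stationary person yields a feature of zero and a moving person a positive one, so the attention weights favour the moving person when the features are pooled.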