Bibliographic Details
Main Authors: Ochin, Jeremie, Chekroun, Raphael, Stanciulescu, Bogdan, Manitsaris, Sotiris
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2511.16183
_version_ 1866917095763083264
author Ochin, Jeremie
Chekroun, Raphael
Stanciulescu, Bogdan
Manitsaris, Sotiris
author_facet Ochin, Jeremie
Chekroun, Raphael
Stanciulescu, Bogdan
Manitsaris, Sotiris
contents Soccer video understanding has motivated the creation of datasets for tasks such as temporal action localization, spatiotemporal action detection (STAD), and multi-object tracking (MOT). The annotation of structured sequences of events (who does what, when, and where) used for soccer analytics requires a holistic approach that integrates both STAD and MOT. However, current action recognition methods remain insufficient for constructing reliable play-by-play data and are typically used to assist rather than fully automate annotation. Parallel research has advanced tactical modeling, trajectory forecasting, and performance analysis, all grounded in game-state and play-by-play data. This motivates leveraging tactical knowledge as a prior to support computer-vision-based predictions, enabling more automated and reliable extraction of play-by-play data. We introduce the Footovision Play-by-Play Action Spotting in Soccer Dataset (FOOTPASS), the first benchmark for play-by-play action spotting over entire soccer matches in a multi-modal, multi-agent tactical context. It enables the development of methods for player-centric action spotting that exploit both outputs from computer-vision tasks (e.g., tracking, identification) and prior knowledge of soccer, including its tactical regularities over long time horizons, to generate reliable play-by-play data streams. These streams form an essential input for data-driven sports analytics.
format Preprint
id arxiv_https___arxiv_org_abs_2511_16183
institution arXiv
publishDate 2025
record_format arxiv
title FOOTPASS: A Multi-Modal Multi-Agent Tactical Context Dataset for Play-by-Play Action Spotting in Soccer Broadcast Videos
topic Artificial Intelligence
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2511.16183