Bibliographic Details
Main Authors: Kharlamova, Arina; He, Bowei; Ma, Chen; Liu, Xue
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access: https://arxiv.org/abs/2604.00927
author Kharlamova, Arina
He, Bowei
Ma, Chen
Liu, Xue
contents We present DANCEMATCH, an end-to-end framework for motion-based dance retrieval: the task of identifying semantically similar choreographies directly from raw video, which we define as DANCE FINGERPRINTING. While existing motion analysis and retrieval methods can compare pose sequences, they rely on continuous embeddings that are difficult to index, interpret, or scale. In contrast, DANCEMATCH constructs compact, discrete motion signatures that capture the spatio-temporal structure of dance while enabling efficient large-scale retrieval. Our system integrates Skeleton Motion Quantisation (SMQ) with Spatio-Temporal Transformers (STT) to encode human poses, extracted via Apple CoMotion, into a structured motion vocabulary. We further design the DANCE RETRIEVAL ENGINE (DRE), which performs sub-linear retrieval using a histogram-based index followed by re-ranking for refined matching. To facilitate reproducible research, we release DANCETYPESBENCHMARK, a pose-aligned dataset annotated with quantised motion tokens. Experiments demonstrate robust retrieval across diverse dance styles and strong generalisation to unseen choreographies, establishing a foundation for scalable motion fingerprinting and quantitative choreographic analysis.
format Preprint
id arxiv_https___arxiv_org_abs_2604_00927
institution arXiv
publishDate 2026
record_format arxiv
title Learning Quantised Structure-Preserving Motion Representations for Dance Fingerprinting
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2604.00927
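
Illustrative note: the abstract describes retrieval as a histogram-based index followed by re-ranking over quantised motion tokens. The sketch below shows that general two-stage pattern only. It is not the authors' DANCE RETRIEVAL ENGINE: the function names, the inverted index, the histogram-intersection score, and the use of difflib for re-ranking are all assumptions made for illustration, and the token sequences are made-up integers standing in for quantised motion tokens.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def build_index(library):
    """Build a token -> clip-id inverted index and a token histogram per clip."""
    inverted = defaultdict(set)
    histograms = {}
    for clip_id, tokens in library.items():
        hist = Counter(tokens)
        histograms[clip_id] = hist
        for token in hist:
            inverted[token].add(clip_id)
    return inverted, histograms


def histogram_intersection(a, b):
    """Normalised overlap of two token histograms, in [0, 1]."""
    overlap = sum((a & b).values())  # Counter & Counter keeps the minimum counts
    return overlap / max(sum(a.values()), sum(b.values()), 1)


def retrieve(query, library, inverted, histograms, shortlist=10, top_k=3):
    """Two-stage lookup: coarse histogram shortlist, then order-aware re-ranking."""
    query_hist = Counter(query)
    # Stage 1: score only clips that share at least one token with the query.
    candidates = set()
    for token in query_hist:
        candidates |= inverted.get(token, set())
    coarse = sorted(
        candidates,
        key=lambda cid: histogram_intersection(query_hist, histograms[cid]),
        reverse=True,
    )[:shortlist]
    # Stage 2: re-rank the shortlist by similarity of the full token sequences
    # (assumed re-ranker; the abstract does not say which one is used).
    reranked = sorted(
        coarse,
        key=lambda cid: SequenceMatcher(None, query, library[cid], autojunk=False).ratio(),
        reverse=True,
    )
    return reranked[:top_k]


# Hypothetical toy library: clip id -> sequence of quantised motion tokens.
library = {
    "waltz_a": [1, 2, 3, 1, 2, 3],
    "waltz_b": [1, 2, 3, 3, 2, 1],
    "hiphop": [7, 8, 9, 7, 8, 9],
}
inverted, histograms = build_index(library)
matches = retrieve([1, 2, 3, 1, 2, 3], library, inverted, histograms)
print(matches)
```

In this toy setup the two waltz clips have identical histograms, so the coarse stage cannot separate them; the re-ranking stage does, because it takes token order into account. The clip with no shared tokens is never scored at all, which is the kind of pruning a histogram index makes possible.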