Bibliographic Details
Main Authors: Mignot, Rémi, Peeters, Geoffroy
Format: Preprint
Published: 2024
Subjects: Signal Processing
Online Access: https://arxiv.org/abs/2403.00688
_version_ 1866913250534227968
author Mignot, Rémi
Peeters, Geoffroy
author_facet Mignot, Rémi
Peeters, Geoffroy
contents For music indexing that is robust to sound degradations and scalable to large music catalogs, this scientific report presents an approach based on audio descriptors that are relevant to the music content and invariant to sound transformations (e.g., noise addition, distortion, lossy coding, pitch/time transformations, or filtering). To this end, one of the key points of the proposed method is the definition of high-dimensional audio prints, which are robust by design to some sound degradations. The high dimensionality of this first representation is then used to learn a linear projection onto a significantly smaller subspace, which further reduces the sensitivity to sound degradations using a series of discriminant analyses. Finally, with the analysis times anchored on local maxima of a selected onset function, an approximate hashing is performed to provide better tolerance to bit corruptions and, at the same time, to make the method easier to scale.
format Preprint
id arxiv_https___arxiv_org_abs_2403_00688
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Degradation-Invariant Music Indexing
Mignot, Rémi
Peeters, Geoffroy
Signal Processing
title Degradation-Invariant Music Indexing
topic Signal Processing
url https://arxiv.org/abs/2403.00688
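
The abstract's final step, a hashing that tolerates bit corruptions, can be illustrated with a generic sketch. This is not the authors' method: it assumes sign binarization of an already projected descriptor and a lookup that also probes every code within a small number of bit flips (a common multi-probe idea). All names here (`binarize`, `neighbors`, `build_index`, `query`) and the toy vectors are hypothetical.

```python
# Illustrative sketch only, not the implementation from the preprint.
# Assumption: descriptors are already projected to a short real-valued vector.
from itertools import combinations


def binarize(vector):
    """Turn a real-valued descriptor into an integer bit code, one bit per sign."""
    code = 0
    for value in vector:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def neighbors(code, n_bits, max_flips=1):
    """Yield the code itself and every code within max_flips bit flips of it."""
    yield code
    for k in range(1, max_flips + 1):
        for positions in combinations(range(n_bits), k):
            flipped = code
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def build_index(items):
    """Map each bit code to the list of track ids stored under it."""
    index = {}
    for track_id, vector in items:
        index.setdefault(binarize(vector), []).append(track_id)
    return index


def query(index, vector, max_flips=1):
    """Return track ids whose code differs from the query code by at most max_flips bits."""
    hits = set()
    for code in neighbors(binarize(vector), len(vector), max_flips):
        hits.update(index.get(code, []))
    return hits


if __name__ == "__main__":
    index = build_index([
        ("track_a", [0.5, -1.0, 0.2, -0.3]),
        ("track_b", [-0.5, 1.0, -0.2, 0.3]),
    ])
    # A degraded copy of track_a whose last component changed sign.
    print(query(index, [0.5, -1.0, 0.2, 0.1]))
```

An exact lookup (`max_flips=0`) would miss the degraded query in this toy case, while allowing one flipped bit recovers it. The number of probed codes grows combinatorially with `max_flips`, so this form only suits short codes and small tolerances.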