Bibliographic Details
Main authors:	Pallenberg, René; Katzberg, Fabrice; Mertins, Alfred; Maass, Marco
Format:	Preprint
Published:	2026
Subjects:	Signal Processing; Artificial Intelligence; Audio and Speech Processing
Online access:	https://arxiv.org/abs/2602.23003

_version_	1866918358095495168
author	Pallenberg, René; Katzberg, Fabrice; Mertins, Alfred; Maass, Marco
author_facet	Pallenberg, René; Katzberg, Fabrice; Mertins, Alfred; Maass, Marco
contents	The use of hearing aids will increase in the coming years due to demographic change. One open problem that a new generation of hearing aids has yet to solve is the cocktail party problem. A possible solution is electroencephalography-based auditory attention decoding. Several studies in recent years have addressed this, and most of them use the same preprocessing methods. This work proposes the scattering transform as an alternative to these preprocessing methods. The two-layer scattering transform is compared with a regular filterbank, the synchrosqueezing short-time Fourier transform, and the common preprocessing. To demonstrate the performance, the known and the proposed preprocessing methods are compared on different classification tasks using two widely used datasets, provided by KU Leuven (KUL) and the Technical University of Denmark (DTU). Both established and new neural-network-based models (CNNs, LSTMs, and recent Transformer/graph-based models) are used for classification. Various evaluation strategies are compared, with a focus on classifying speakers that were not seen during training. We show that the two-layer scattering transform can significantly improve the performance for subject-related conditions, especially on the KUL dataset. However, on the DTU dataset, this only applies to some of the models, or when larger amounts of training data are provided, as in 10-fold cross-validation. This suggests that the scattering transform is capable of extracting additional relevant information.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_23003
institution	arXiv
publishDate	2026
record_format	arxiv
title	Scattering Transform for Auditory Attention Decoding
topic	Signal Processing; Artificial Intelligence; Audio and Speech Processing
url	https://arxiv.org/abs/2602.23003
