Bibliographic Details
Main Authors: Liang, Yayun, Zhang, Yuanming, Chen, Fei, Lu, Jing, Lin, Zhibin
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2601.20542
author Liang, Yayun
Zhang, Yuanming
Chen, Fei
Lu, Jing
Lin, Zhibin
contents Recent advances in reconstructing speech envelopes from electroencephalogram (EEG) signals have enabled continuous auditory attention decoding (AAD) in multi-speaker environments. Most deep neural network (DNN)-based envelope reconstruction models are trained to maximize the Pearson correlation coefficient (PCC) between the attended envelope and the reconstructed envelope (the attended PCC). Although the difference between the attended PCC and the unattended PCC plays an essential role in AAD, existing methods often focus on maximizing the attended PCC alone. We therefore propose a contrastive PCC loss that represents the difference between the attended PCC and the unattended PCC. The proposed approach is evaluated on three public EEG AAD datasets using four DNN architectures. Across many settings, the proposed objective improves envelope separability and AAD accuracy, while also revealing dataset- and architecture-dependent failure cases.
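The idea of a contrastive PCC objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`pearson`, `contrastive_pcc_loss`, `decode_attention`), the sign convention (negating the difference so that a lower loss is better), and the `eps` stabilizer are all assumptions made for illustration, and the record does not state the exact form of the loss.

```python
import numpy as np


def pearson(x, y, eps=1e-8):
    """Pearson correlation coefficient between two 1-D signals.

    eps is a hypothetical stabilizer that avoids division by zero
    for constant signals; it is not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc) + eps))


def contrastive_pcc_loss(reconstructed, attended, unattended):
    """Assumed form: negative of (attended PCC - unattended PCC).

    Minimizing this value pushes the reconstruction toward the attended
    envelope and away from the unattended one.
    """
    attended_pcc = pearson(reconstructed, attended)
    unattended_pcc = pearson(reconstructed, unattended)
    return -(attended_pcc - unattended_pcc)


def decode_attention(reconstructed, envelope_a, envelope_b):
    """Pick the speaker whose envelope correlates more with the reconstruction."""
    pcc_a = pearson(reconstructed, envelope_a)
    pcc_b = pearson(reconstructed, envelope_b)
    return "a" if pcc_a >= pcc_b else "b"


if __name__ == "__main__":
    # Synthetic stand-ins for speech envelopes (illustrative data only).
    t = np.linspace(0.0, 1.0, 200)
    attended = np.sin(2 * np.pi * 3 * t)
    unattended = np.cos(2 * np.pi * 5 * t)
    reconstructed = attended + 0.1 * unattended

    print(contrastive_pcc_loss(reconstructed, attended, unattended))
    print(decode_attention(reconstructed, attended, unattended))
```

In a training setting the same difference would be computed with a differentiable framework on batches of decision windows; the NumPy version above only shows how the two correlations enter the objective and the decoding decision.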
format Preprint
id arxiv_https___arxiv_org_abs_2601_20542
institution arXiv
publishDate 2026
record_format arxiv
title Decoding Speech Envelopes from Electroencephalogram with a Contrastive Pearson Correlation Coefficient Loss
topic Audio and Speech Processing
url https://arxiv.org/abs/2601.20542