Bibliographic Details
Main Authors: Mo, Kaien, Wang, Xianrui, Yang, Yichen, Makino, Shoji, Chen, Jingdong
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2406.09821
contents Blind audio source separation (BASS) techniques, particularly those with low latency, play an important role in a wide range of real-time systems, e.g., hearing aids, in-car hands-free voice communication, and real-time human-machine interaction. Most existing BASS algorithms are designed to run in batch mode, so large latency is unavoidable. Recently, some online algorithms have been developed that perform separation on a frame-by-frame basis in the short-time Fourier transform (STFT) domain, which significantly reduces latency compared with batch methods. However, the latency of these algorithms may still be too long for many real-time systems. To further reduce latency while achieving good separation performance, we propose in this work to integrate a weighted prediction error (WPE) module into non-causal sample-truncating-based independent vector analysis (NST-IVA). The resulting algorithm maintains the same algorithmic delay as NST-IVA, provided the delay introduced by WPE is appropriately controlled, while achieving significantly better performance, as validated by simulations.
format Preprint
id arxiv_https___arxiv_org_abs_2406_09821
institution arXiv
publishDate 2024
record_format arxiv
title Low algorithmic delay implementation of convolutional beamformer for online joint source separation and dereverberation
topic Audio and Speech Processing
url https://arxiv.org/abs/2406.09821