Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Mo, Kaien; Wang, Xianrui; Yang, Yichen; Makino, Shoji; Chen, Jingdong
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.09821

_version_	1866913391198601216
author	Mo, Kaien; Wang, Xianrui; Yang, Yichen; Makino, Shoji; Chen, Jingdong
author_facet	Mo, Kaien; Wang, Xianrui; Yang, Yichen; Makino, Shoji; Chen, Jingdong
contents	Blind-audio-source-separation (BASS) techniques, particularly those with low latency, play an important role in a wide range of real-time systems, e.g., hearing aids, in-car hands-free voice communication, and real-time human-machine interaction. Most existing BASS algorithms are designed to run in batch mode, so large latency is unavoidable. Recently, some online algorithms have been developed that achieve separation on a frame-by-frame basis in the short-time-Fourier-transform (STFT) domain, significantly reducing latency compared with batch methods. However, the latency of these algorithms may still be too long for many real-time systems to tolerate. To further reduce latency while achieving good separation performance, we propose in this work to integrate a weighted prediction error (WPE) module into a non-causal sample-truncating-based independent vector analysis (NST-IVA). The resulting algorithm maintains the same algorithmic delay as NST-IVA if the delay of WPE is appropriately controlled, while achieving significantly better performance, as validated by simulations.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_09821
institution	arXiv
publishDate	2024
record_format	arxiv
title	Low algorithmic delay implementation of convolutional beamformer for online joint source separation and dereverberation
topic	Audio and Speech Processing
url	https://arxiv.org/abs/2406.09821

Similar Items