Staff View: KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge :: Library Catalog

Bibliographic Details
Main Authors:	Yu, Guochen; Han, Runqiang; Xu, Chenglin; Zhao, Haoran; Li, Nan; Zhang, Chen; Zheng, Xiguang; Zhou, Chao; Huang, Qi; Yu, Bing
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2402.01808

_version_	1866910317664010240
author	Yu, Guochen; Han, Runqiang; Xu, Chenglin; Zhao, Haoran; Li, Nan; Zhang, Chen; Zheng, Xiguang; Zhou, Chao; Huang, Qi; Yu, Bing
author_facet	Yu, Guochen; Han, Runqiang; Xu, Chenglin; Zhao, Haoran; Li, Nan; Zhang, Chen; Zheng, Xiguang; Zhou, Chao; Huang, Qi; Yu, Bing
contents	This paper presents the speech restoration and enhancement system created by the 1024K team for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. Our system consists of a complex-domain generative adversarial network (GAN) for speech restoration and a fine-grained multi-band fusion module for speech enhancement. On the SSI blind test set, the proposed system achieves an overall mean opinion score (MOS) of 3.49 based on ITU-T P.804 and a Word Accuracy Rate (WAcc) of 0.78 in the real-time track, and an overall P.804 MOS of 3.43 and a WAcc of 0.78 in the non-real-time track, ranking 1st in both tracks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_01808
institution	arXiv
publishDate	2024
record_format	arxiv
title	KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge
topic	Sound; Audio and Speech Processing
url	https://arxiv.org/abs/2402.01808
