Library Catalog

Bibliographic Details
Main Authors:	Wang, Syu-Siang; Chen, Jia-Yang; Bai, Bo-Ren; Fang, Shih-Hau; Tsao, Yu
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Signal Processing
Online Access:	https://arxiv.org/abs/2407.01939

Similar Items

A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)

SELM: Speech Enhancement Using Discrete Tokens and Language Models
by: Wang, Ziqian, et al.
Published: (2023)

Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks
by: Tokala, Vikas, et al.
Published: (2025)

TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations
by: Gao, Xiaoxue, et al.
Published: (2024)

Speech Enhancement based on cascaded two flows
by: Lee, Seonggyu, et al.
Published: (2025)

USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering
by: Wang, Zhong-Qiu
Published: (2024)

FlowSE: Flow Matching-based Speech Enhancement
by: Lee, Seonggyu, et al.
Published: (2025)

GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
by: Shetu, Shrishti Saha, et al.
Published: (2024)

DeFTAN-II: Efficient Multichannel Speech Enhancement with Subgroup Processing
by: Lee, Dongheon, et al.
Published: (2023)

FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching
by: Wang, Ziqian, et al.
Published: (2025)

HyBeam: Hybrid Microphone-Beamforming Array-Agnostic Speech Enhancement for Wearables
by: Ilan, Yuval Bar, et al.
Published: (2025)

RRP-Voice: A Longitudinal Dataset and Benchmark for Recurrent Respiratory Papillomatosis Detection
by: Ren, Wenze, et al.
Published: (2026)

Toward Universal Speech Enhancement for Diverse Input Conditions
by: Zhang, Wangyou, et al.
Published: (2023)

Voice Mapping of Text-to-Speech Systems: A Metric-Based Approach for Voice Quality Assessment
by: Cai, Huanchen, et al.
Published: (2026)

Microphone Array Signal Processing and Deep Learning for Speech Enhancement
by: Haeb-Umbach, Reinhold, et al.
Published: (2025)

Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)

Generic Speech Enhancement with Self-Supervised Representation Space Loss
by: Sato, Hiroshi, et al.
Published: (2025)

Contrastive Knowledge Distillation for Embedding Refinement in Personalized Speech Enhancement
by: Serre, Thomas, et al.
Published: (2026)

SuperM2M: Supervised and Mixture-to-Mixture Co-Learning for Speech Enhancement and Noise-Robust ASR
by: Wang, Zhong-Qiu
Published: (2024)

PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement
by: Zhou, Nan, et al.
Published: (2024)

Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems
by: Kuang, Lee Shih
Published: (2024)

FlexIO: Flexible Single- and Multi-Channel Speech Separation and Enhancement
by: Masuyama, Yoshiki, et al.
Published: (2025)

Unsupervised Variational Acoustic Clustering
by: Fiorio, Luan Vinícius, et al.
Published: (2025)

Mel-McNet: A Mel-Scale Framework for Online Multichannel Speech Enhancement
by: Yang, Yujie, et al.
Published: (2025)

Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia Symptoms
by: Premananth, Gowtham, et al.
Published: (2024)

Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec
by: Ren, Yanzhou, et al.
Published: (2026)

SpeechMLC: Speech Multi-label Classification
by: Kim, Miseul, et al.
Published: (2025)

LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement
by: Yan, Haoyin, et al.
Published: (2024)

Speech Enhancement Based on Drifting Models
by: Xu, Liang, et al.
Published: (2026)

A Speech Production Model for Radar: Connecting Speech Acoustics with Radar-Measured Vibrations
by: Lenz, Isabella, et al.
Published: (2025)

Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds
by: Bae, Hanbin, et al.
Published: (2024)

ParaS2S: Benchmarking and Aligning Spoken Language Models for Paralinguistic-aware Speech-to-Speech Interaction
by: Yang, Shu-wen, et al.
Published: (2025)

Unsupervised detection and classification of heartbeats using the dissimilarity matrix in PCG signals
by: Torre-Cruz, J., et al.
Published: (2024)

FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
by: Hao, Xiang, et al.
Published: (2020)

Binaural Localization Model for Speech in Noise
by: Tokala, Vikas, et al.
Published: (2025)

Speech-Based Prioritization for Schizophrenia Intervention
by: Premananth, Gowtham, et al.
Published: (2025)

Prompt-driven Target Speech Diarization
by: Jiang, Yidi, et al.
Published: (2023)

Brain-Informed Speech Separation for Cochlear Implants
by: Gajecki, Tom, et al.
Published: (2026)

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses
by: Xu, Zhongweiyang, et al.
Published: (2024)