:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Tingting; Wang, Tianrui; Ge, Meng; Zhang, Qiquan; Ge, Zirui; Yang, Zhen
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2412.16823

Similar Items

Learning Time-Graph Frequency Representation for Monaural Speech Enhancement
by: Wang, Tingting, et al.
Published: (2025)

Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
by: Ge, Zirui, et al.
Published: (2023)

LORT: Locally Refined Convolution and Taylor Transformer for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2025)

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)

Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2024)

Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
by: Cui, Zhongjian, et al.
Published: (2025)

Selective State Space Model for Monaural Speech Enhancement
by: Chen, Moran, et al.
Published: (2024)

An Exploration of Length Generalization in Transformer-Based Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)

Exploring Length Generalization For Transformer-based Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2025)

Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)

Long-Context Modeling Networks for Monaural Speech Enhancement: A Comparative Study
by: Zhang, Qiquan, et al.
Published: (2025)

SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
by: Zhang, Xiangyu, et al.
Published: (2025)

ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning
by: Wang, Junyu, et al.
Published: (2025)

Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
by: Wang, Tianrui, et al.
Published: (2025)

FakeMark: Deepfake Speech Attribution With Watermarked Artifacts
by: Ge, Wanying, et al.
Published: (2025)

FRCRN: Boosting Feature Representation using Frequency Recurrence for Monaural Speech Enhancement
by: Zhao, Shengkui, et al.
Published: (2022)

Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
by: Wang, Xin, et al.
Published: (2025)

Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)

Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?
by: Wang, Xin, et al.
Published: (2026)

Code-switching Speech Recognition Under the Lens: Model- and Data-Centric Perspectives
by: Liu, Hexin, et al.
Published: (2025)

Hybrid Real- And Complex-Valued Neural Network Concept For Low-Complexity Phase-Aware Speech Enhancement
by: Fiorio, Luan Vinícius, et al.
Published: (2025)

MeanSE: Efficient Generative Speech Enhancement with Mean Flows
by: Wang, Jiahe, et al.
Published: (2025)

Test-Time Adaptation For Speech Enhancement Via Mask Polarization
by: Raichle, Tobias, et al.
Published: (2026)

Leveraging Local and Global Knowledge Integration with Time-Frequency Calibrated Distillation for Speech Enhancement
by: Cheng, Jiaming, et al.
Published: (2025)

SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech
by: Lin, Jingru, et al.
Published: (2024)

Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
by: Lu, Ye-Xin, et al.
Published: (2023)

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)

CFMDCTCodec: A Low-Bitrate Neural Speech Codec with Noise-Prior-aware Conditional Flow Matching for MDCT-Spectral Enhancement
by: Jiang, Xiao-Hang, et al.
Published: (2026)

Sparsity-Driven EEG Channel Selection for Brain-Assisted Speech Enhancement
by: Zhang, Jie, et al.
Published: (2023)

Investigation of Speech and Noise Latent Representations in Single-channel VAE-based Speech Enhancement
by: Li, Jiatong, et al.
Published: (2025)

Evaluating the Expressive Appropriateness of Speech in Rich Contexts
by: Wang, Tianrui, et al.
Published: (2026)

Test-Time Adaptation for Speech Enhancement via Domain Invariant Embedding Transformation
by: Raichle, Tobias, et al.
Published: (2025)

ARiSE: Auto-Regressive Multi-Channel Speech Enhancement
by: Shen, Pengjie, et al.
Published: (2025)

Advancing Electrolaryngeal Speech Enhancement Through Speech-Text Representation Learning
by: Ma, Ding, et al.
Published: (2026)

Leveraging Discriminative Latent Representations for Conditioning GAN-Based Speech Enhancement
by: Shetu, Shrishti Saha, et al.
Published: (2025)

Predictive-Generative Drift Decomposition for Speech Enhancement and Separation
by: Richter, Julius, et al.
Published: (2026)

Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
by: Yang, Yifan, et al.
Published: (2026)

FastEnhancer: Speed-Optimized Streaming Neural Speech Enhancement
by: Ahn, Sunghwan, et al.
Published: (2025)

Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)