| Main Authors: | Wang, Tingting, Wang, Tianrui, Ge, Meng, Zhang, Qiquan, Ge, Zirui, Yang, Zhen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.16823 |
Similar Items
Learning Time-Graph Frequency Representation for Monaural Speech Enhancement
by: Wang, Tingting, et al.
Published: (2025)
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
by: Ge, Zirui, et al.
Published: (2023)
LORT: Locally Refined Convolution and Taylor Transformer for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2025)
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)
Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2024)
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
by: Cui, Zhongjian, et al.
Published: (2025)
Selective State Space Model for Monaural Speech Enhancement
by: Chen, Moran, et al.
Published: (2024)
An Exploration of Length Generalization in Transformer-Based Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)
Exploring Length Generalization For Transformer-based Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2025)
Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)
Long-Context Modeling Networks for Monaural Speech Enhancement: A Comparative Study
by: Zhang, Qiquan, et al.
Published: (2025)
SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
by: Zhang, Xiangyu, et al.
Published: (2025)
ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning
by: Wang, Junyu, et al.
Published: (2025)
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
by: Wang, Tianrui, et al.
Published: (2025)
FakeMark: Deepfake Speech Attribution With Watermarked Artifacts
by: Ge, Wanying, et al.
Published: (2025)
FRCRN: Boosting Feature Representation using Frequency Recurrence for Monaural Speech Enhancement
by: Zhao, Shengkui, et al.
Published: (2022)
Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
by: Wang, Xin, et al.
Published: (2025)
Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)
Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?
by: Wang, Xin, et al.
Published: (2026)
Code-switching Speech Recognition Under the Lens: Model- and Data-Centric Perspectives
by: Liu, Hexin, et al.
Published: (2025)
Hybrid Real- And Complex-Valued Neural Network Concept For Low-Complexity Phase-Aware Speech Enhancement
by: Fiorio, Luan Vinícius, et al.
Published: (2025)
MeanSE: Efficient Generative Speech Enhancement with Mean Flows
by: Wang, Jiahe, et al.
Published: (2025)
Test-Time Adaptation For Speech Enhancement Via Mask Polarization
by: Raichle, Tobias, et al.
Published: (2026)
Leveraging Local and Global Knowledge Integration with Time-Frequency Calibrated Distillation for Speech Enhancement
by: Cheng, Jiaming, et al.
Published: (2025)
SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech
by: Lin, Jingru, et al.
Published: (2024)
Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
by: Lu, Ye-Xin, et al.
Published: (2023)
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)
CFMDCTCodec: A Low-Bitrate Neural Speech Codec with Noise-Prior-aware Conditional Flow Matching for MDCT-Spectral Enhancement
by: Jiang, Xiao-Hang, et al.
Published: (2026)
Sparsity-Driven EEG Channel Selection for Brain-Assisted Speech Enhancement
by: Zhang, Jie, et al.
Published: (2023)
Investigation of Speech and Noise Latent Representations in Single-channel VAE-based Speech Enhancement
by: Li, Jiatong, et al.
Published: (2025)
Evaluating the Expressive Appropriateness of Speech in Rich Contexts
by: Wang, Tianrui, et al.
Published: (2026)
Test-Time Adaptation for Speech Enhancement via Domain Invariant Embedding Transformation
by: Raichle, Tobias, et al.
Published: (2025)
ARiSE: Auto-Regressive Multi-Channel Speech Enhancement
by: Shen, Pengjie, et al.
Published: (2025)
Advancing Electrolaryngeal Speech Enhancement Through Speech-Text Representation Learning
by: Ma, Ding, et al.
Published: (2026)
Leveraging Discriminative Latent Representations for Conditioning GAN-Based Speech Enhancement
by: Shetu, Shrishti Saha, et al.
Published: (2025)
Predictive-Generative Drift Decomposition for Speech Enhancement and Separation
by: Richter, Julius, et al.
Published: (2026)
Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
by: Yang, Yifan, et al.
Published: (2026)
FastEnhancer: Speed-Optimized Streaming Neural Speech Enhancement
by: Ahn, Sunghwan, et al.
Published: (2025)
Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)