| Main Authors: | Zhang, Yuewei, Zou, Huanbin, Zhu, Jie |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.10494 |
Similar Items
Speech Enhancement with Overlapped-Frame Information Fusion and Causal Self-Attention
by: Zhang, Yuewei, et al.
Published: (2025)
A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement
by: Lu, Shenghui, et al.
Published: (2025)
LABNet: A Lightweight Attentive Beamforming Network for Ad-hoc Multichannel Microphone Invariant Real-Time Speech Enhancement
by: Yan, Haoyin, et al.
Published: (2025)
Dense-TSNet: Dense Connected Two-Stage Structure for Ultra-Lightweight Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2024)
A Multi-Stage Framework for Multimodal Controllable Speech Synthesis
by: Niu, Rui, et al.
Published: (2025)
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)
A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation
by: Wang, Jingyuan, et al.
Published: (2024)
LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement
by: Yan, Haoyin, et al.
Published: (2024)
From Continuous to Discrete: Cross-Domain Collaborative General Speech Enhancement via Hierarchical Language Models
by: Mu, Zhaoxi, et al.
Published: (2025)
Geometry-Constrained EEG Channel Selection for Brain-Assisted Speech Enhancement
by: Zuo, Keying, et al.
Published: (2024)
Improving Speech Enhancement by Cross- and Sub-band Processing with State Space Model
by: Li, Jizhen, et al.
Published: (2025)
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
by: Yang, Bing, et al.
Published: (2024)
SaD: A Scenario-Aware Discriminator for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)
Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)
A Domain Adaptation Framework for Speech Recognition Systems with Only Synthetic data
by: Tran, Minh, et al.
Published: (2025)
GAP-URGENet: A Generative-Predictive Fusion Framework for Universal Speech Enhancement
by: Rong, Xiaobin, et al.
Published: (2026)
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
by: Zhu, Qiushi, et al.
Published: (2024)
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
by: Cui, Zhongjian, et al.
Published: (2025)
Adaptive Convolution for CNN-based Speech Enhancement Models
by: Wang, Dahan, et al.
Published: (2025)
Metadata-Enhanced Speech Emotion Recognition: Augmented Residual Integration and Co-Attention in Two-Stage Fine-Tuning
by: Wan, Zixiang, et al.
Published: (2024)
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)
Context-Aware Two-Step Training Scheme for Domain Invariant Speech Separation
by: Wang, Wupeng, et al.
Published: (2025)
DroFiT: A Lightweight Band-fused Frequency Attention Toward Real-time UAV Speech Enhancement
by: Lee, Jeongmin, et al.
Published: (2025)
Attention-Based Beamformer For Multi-Channel Speech Enhancement
by: Bai, Jinglin, et al.
Published: (2024)
ICASSP 2026 URGENT Speech Enhancement Challenge
by: Li, Chenda, et al.
Published: (2026)
SpecTokenizer: A Lightweight Streaming Codec in the Compressed Spectrum Domain
by: Wan, Zixiang, et al.
Published: (2025)
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
by: Dementyev, Artem, et al.
Published: (2024)
Improving Design of Input Condition Invariant Speech Enhancement
by: Zhang, Wangyou, et al.
Published: (2024)
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)
FUSE: Universal Speech Enhancement using Multi-Stage Fusion of Sparse Compression and Token Generation Models for the URGENT 2025 Challenge
by: Goswami, Nabarun, et al.
Published: (2025)
Universal Robust Speech Adaptation for Cross-Domain Speech Recognition and Enhancement
by: Wang, Chien-Chun, et al.
Published: (2026)
Absorbing Discrete Diffusion for Speech Enhancement
by: Gonzalez, Philippe
Published: (2026)
Leveraging Local and Global Knowledge Integration with Time-Frequency Calibrated Distillation for Speech Enhancement
by: Cheng, Jiaming, et al.
Published: (2025)
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)
Improved Remixing Process for Domain Adaptation-Based Speech Enhancement by Mitigating Data Imbalance in Signal-to-Noise Ratio
by: Li, Li, et al.
Published: (2024)
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
by: Zhang, Wangyou, et al.
Published: (2024)
EDNet: A Versatile Speech Enhancement Framework with Gating Mamba Mechanism and Phase Shift-Invariant Training
by: Kwak, Doyeop, et al.
Published: (2025)
FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
by: Hao, Xiang, et al.
Published: (2020)
ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis
by: Tang, Haobin, et al.
Published: (2024)
Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling
by: Zhang, Leying, et al.
Published: (2024)