| Main Authors: | Luo, Longjie, Lu, Shenghui, Li, Lin, Hong, Qingyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.24446 |
Similar Items
SuPseudo: A Pseudo-supervised Learning Method for Neural Speech Enhancement in Far-field Speech Recognition
by: Luo, Longjie, et al.
Published: (2025)
A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement
by: Lu, Shenghui, et al.
Published: (2025)
Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm
by: Li, Zhaoyang, et al.
Published: (2025)
An audio-quality-based multi-strategy approach for target speaker extraction in the MISP 2023 Challenge
by: Han, Runduo, et al.
Published: (2024)
ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement
by: Wang, Zhong-Qiu
Published: (2024)
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition
by: Gao, Ming, et al.
Published: (2025)
DCIM-AVSR : Efficient Audio-Visual Speech Recognition via Dual Conformer Interaction Module
by: Wang, Xinyu, et al.
Published: (2024)
Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge
by: Huang, Shangkun, et al.
Published: (2025)
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
by: Wang, He, et al.
Published: (2024)
ICASSP 2026 URGENT Speech Enhancement Challenge
by: Li, Chenda, et al.
Published: (2026)
Channel Adaptation for Speaker Verification Using Optimal Transport with Pseudo Label
by: Yang, Wenhao, et al.
Published: (2024)
XMUspeech Systems for the ASVspoof 5 Challenge
by: Li, Wangjie, et al.
Published: (2025)
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
by: Guan, Wenhao, et al.
Published: (2023)
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
by: Zhang, Wangyou, et al.
Published: (2024)
DS-Codec: Dual-Stage Training with Mirror-to-NonMirror Architecture Switching for Speech Codec
by: Chen, Peijie, et al.
Published: (2025)
AD-AVSR: Asymmetric Dual-stream Enhancement for Robust Audio-Visual Speech Recognition
by: Xue, Junxiao, et al.
Published: (2025)
Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)
Adaptive Convolution for CNN-based Speech Enhancement Models
by: Wang, Dahan, et al.
Published: (2025)
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
by: Lin, Zhaofeng, et al.
Published: (2023)
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)
Phoenix-VAD: Streaming Semantic Endpoint Detection for Full-Duplex Speech Interaction
by: Wu, Weijie, et al.
Published: (2025)
The TMU System for the XACLE Challenge: Training Large Audio Language Models with CLAP Pseudo-Labels
by: Tsutsumi, Ayuto, et al.
Published: (2026)
Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
by: Quan, Changsheng, et al.
Published: (2024)
Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach
by: Lin, Yi-Cheng, et al.
Published: (2025)
Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels
by: Aldeneh, Zakaria, et al.
Published: (2024)
Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)
Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario
by: Wen, Wen, et al.
Published: (2024)
A Lightweight Fourier-based Network for Binaural Speech Enhancement with Spatial Cue Preservation
by: Lu, Xikun, et al.
Published: (2025)
Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement
by: Li, Chenda, et al.
Published: (2024)
Diffusion-based Signal Refiner for Speech Enhancement and Separation
by: Hirano, Masato, et al.
Published: (2023)
Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)
Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models
by: Cappellazzo, Umberto, et al.
Published: (2025)
On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech Enhancement
by: Hsieh, Tsun-An, et al.
Published: (2024)
Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model
by: Huang, Hukai, et al.
Published: (2024)
Dense-TSNet: Dense Connected Two-Stage Structure for Ultra-Lightweight Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2024)
Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)
PrimeK-Net: Multi-scale Spectral Learning via Group Prime-Kernel Convolutional Neural Networks for Single Channel Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2025)
Universal Score-based Speech Enhancement with High Content Preservation
by: Scheibler, Robin, et al.
Published: (2024)