| Main Authors: | Luo, Longjie; Li, Lin; Hong, Qingyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.24450 |
Similar Items
Pseudo Labels-based Neural Speech Enhancement for the AVSR Task in the MISP-Meeting Challenge
by: Luo, Longjie, et al.
Published: (2025)
ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement
by: Wang, Zhong-Qiu
Published: (2024)
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
by: Lin, Zhaofeng, et al.
Published: (2023)
Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach
by: Lin, Yi-Cheng, et al.
Published: (2025)
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)
A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement
by: Lu, Shenghui, et al.
Published: (2025)
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)
Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model
by: Huang, Hukai, et al.
Published: (2024)
Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)
Adaptive Data Augmentation with NaturalSpeech3 for Far-field Speaker Verification
by: Zhang, Li, et al.
Published: (2025)
Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features
by: Tsunoo, Emiru, et al.
Published: (2024)
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
by: Ochiai, Tsubasa, et al.
Published: (2024)
ReFlow-TTS: A Rectified Flow Model for High-fidelity Text-to-Speech
by: Guan, Wenhao, et al.
Published: (2023)
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)
Advancing Electrolaryngeal Speech Enhancement Through Speech-Text Representation Learning
by: Ma, Ding, et al.
Published: (2026)
GMP-TL: Gender-augmented Multi-scale Pseudo-label Enhanced Transfer Learning for Speech Emotion Recognition
by: Pan, Yu, et al.
Published: (2024)
DS-Codec: Dual-Stage Training with Mirror-to-NonMirror Architecture Switching for Speech Codec
by: Chen, Peijie, et al.
Published: (2025)
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)
Phoenix-VAD: Streaming Semantic Endpoint Detection for Full-Duplex Speech Interaction
by: Wu, Weijie, et al.
Published: (2025)
WhispEar: A Bi-directional Framework for Scaling Whispered Speech Conversion via Pseudo-Parallel Whisper Generation
by: Fang, Zihao, et al.
Published: (2026)
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)
Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition
by: Yang, Da-Hee, et al.
Published: (2026)
Efficient Long-Form Speech Recognition for General Speech In-Context Learning
by: Yen, Hao, et al.
Published: (2024)
Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm
by: Li, Zhaoyang, et al.
Published: (2025)
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data
by: Cao, Songjun, et al.
Published: (2025)
Position-invariant Fine-tuning of Speech Enhancement Models with Self-supervised Speech Representations
by: Meghanani, Amit, et al.
Published: (2026)
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech
by: Lin, Guan-Ting, et al.
Published: (2024)
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
by: Cui, Zhongjian, et al.
Published: (2025)
Advanced Long-Content Speech Recognition With Factorized Neural Transducer
by: Gong, Xun, et al.
Published: (2024)
Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
by: Quan, Changsheng, et al.
Published: (2024)
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)
Probing Self-supervised Learning Models with Target Speech Extraction
by: Peng, Junyi, et al.
Published: (2024)
In-Materia Speech Recognition
by: Zolfagharinejad, Mohamadreza, et al.
Published: (2024)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario
by: Wen, Wen, et al.
Published: (2024)
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning
by: Zhu, Xinfa, et al.
Published: (2023)
Objective and Subjective Evaluation of Diffusion-Based Speech Enhancement for Dysarthric Speech
by: de Groot, Dimme, et al.
Published: (2025)
Uncovering the Visual Contribution in Audio-Visual Speech Recognition
by: Lin, Zhaofeng, et al.
Published: (2024)
On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech Enhancement
by: Hsieh, Tsun-An, et al.
Published: (2024)
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)