:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Gang; Lei, Yue; Tai, Wenxin; Wu, Jin; Chen, Jia; Zhong, Ting; Zhou, Fan
Format:	Preprint
Published:	2025
Subjects:	Sound; Artificial Intelligence; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2509.15952

Similar Items

MeanFlowSE: One-Step Generative Speech Enhancement via MeanFlow
by: Zhu, Yike, et al.
Published: (2025)

Schrödinger Bridge Mamba for One-Step Speech Enhancement
by: Yang, Jing, et al.
Published: (2025)

DSFlow: Dual Supervision and Step-Aware Architecture for One-Step Flow Matching Speech Synthesis
by: Lin, Bin, et al.
Published: (2026)

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
by: Jung, Chaeyoung, et al.
Published: (2024)

Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition
by: Yang, Da-Hee, et al.
Published: (2026)

Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling
by: Zheng, Qixi, et al.
Published: (2025)

DialoSpeech: Dual-Speaker Dialogue Generation with LLM and Flow Matching
by: Xie, Hanke, et al.
Published: (2025)

StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding
by: Guo, Dake, et al.
Published: (2025)

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
by: Zhou, Yixuan, et al.
Published: (2024)

Robust One-step Speech Enhancement via Consistency Distillation
by: Xu, Liang, et al.
Published: (2025)

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)

Objective and Subjective Evaluation of Diffusion-Based Speech Enhancement for Dysarthric Speech
by: de Groot, Dimme, et al.
Published: (2025)

DUET: Unified Dual-Space Emotion Control for Diffusion and Flow-Matching Driven Text-to-Speech
by: Zhang, Xu, et al.
Published: (2026)

FlowW2N: Whispered-to-Normal Speech Conversion via Flow-Matching
by: Ritter-Gutierrez, Fabian, et al.
Published: (2026)

VoiceRestore: Flow-Matching Transformers for Speech Recording Quality Restoration
by: Kirdey, Stanislav
Published: (2025)

DiT-Flow: Speech Enhancement Robust to Multiple Distortions based on Flow Matching in Latent Space and Diffusion Transformers
by: Cao, Tianyu, et al.
Published: (2026)

ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement
by: Wang, Zhong-Qiu
Published: (2024)

Speech Enhancement with Dual-path Multi-Channel Linear Prediction Filter and Multi-norm Beamforming
by: Qin, Chengyuan, et al.
Published: (2025)

F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
by: Sun, Xiaohui, et al.
Published: (2025)

Ultra-Low Latency Speech Enhancement - A Comprehensive Study
by: Wu, Haibin, et al.
Published: (2024)

OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
by: Huynh-Nguyen, Hieu-Nghia, et al.
Published: (2025)

Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)

ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
by: Zhu, Han, et al.
Published: (2025)

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
by: Chen, Yushen, et al.
Published: (2024)

SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement
by: Hou, Zhongshu, et al.
Published: (2024)

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)

Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)

SaD: A Scenario-Aware Discriminator for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)

Drax: Speech Recognition with Discrete Flow Matching
by: Navon, Aviv, et al.
Published: (2025)

StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
by: Yao, Jixun, et al.
Published: (2024)

Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2025)

Absorbing Discrete Diffusion for Speech Enhancement
by: Gonzalez, Philippe
Published: (2026)

BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement
by: Fan, Cunhang, et al.
Published: (2024)

Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
by: Xue, Ke, et al.
Published: (2026)

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)

Advancing Electrolaryngeal Speech Enhancement Through Speech-Text Representation Learning
by: Ma, Ding, et al.
Published: (2026)

InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
by: Zeng, Chang, et al.
Published: (2024)

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
by: Yao, Jixun, et al.
Published: (2025)

DiTSE: High-Fidelity Generative Speech Enhancement via Latent Diffusion Transformers
by: Guimarães, Heitor R., et al.
Published: (2025)