:: Library Catalog

Cover Image

Saved in:

Bibliographic Details
Main Authors:	Liu, Chihyun, Fan, Jiaxuan, Sun, Mingtung, Anthony, Michael, Bai, Mingsian R., Tsao, Yu
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.05270
Tags:	Add Tag No Tags, Be the first to tag this record!

Similar Items

Egonoise Resilient Source Localization and Speech Enhancement for Drones Using a Hybrid Model and Learning-Based Approach
by: Wu, Yihsuan, et al.
Published: (2025)

Attention-Based Beamformer For Multi-Channel Speech Enhancement
by: Bai, Jinglin, et al.
Published: (2024)

Spatial-Temporal Activity-Informed Diarization and Separation
by: Hsu, Yicheng, et al.
Published: (2024)

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)

A tunable binaural audio telepresence system capable of balancing immersive and enhanced modes
by: Hsu, Yicheng, et al.
Published: (2024)

Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
by: Yang, Hsiang-Cheng, et al.
Published: (2026)

Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)

Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)

An Investigation of Incorporating Mamba for Speech Enhancement
by: Chao, Rong, et al.
Published: (2024)

Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics
by: Wang, Syu-Siang, et al.
Published: (2024)

From Evaluation to Optimization: Neural Speech Assessment for Downstream Applications
by: Tsao, Yu
Published: (2025)

Heterogeneous Space Fusion and Dual-Dimension Attention: A New Paradigm for Speech Enhancement
by: Zheng, Tao, et al.
Published: (2024)

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
by: Ren, Wenze, et al.
Published: (2024)

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)

Universal Speech Enhancement with Regression and Generative Mamba
by: Chao, Rong, et al.
Published: (2025)

EffortNet: A Deep Learning Framework for Objective Assessment of Speech Enhancement Technologies Using EEG-Based Alpha Oscillations
by: Sung, Ching-Chih, et al.
Published: (2025)

BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network based on Self-Supervised Embedding
by: Mattursun, Alimjan, et al.
Published: (2024)

An Investigation on Combining Geometry and Consistency Constraints into Phase Estimation for Speech Enhancement
by: Ho, Chun-Wei, et al.
Published: (2025)

MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement
by: Kühne, Nikolai Lund, et al.
Published: (2025)

Mixture to Beamformed Mixture: Leveraging Beamformed Mixture as Weak-Supervision for Speech Enhancement and Noise-Robust ASR
by: Wang, Zhong-Qiu, et al.
Published: (2025)

Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR
by: Lu, Xugang, et al.
Published: (2025)

Speech Enhancement Based on Drifting Models
by: Xu, Liang, et al.
Published: (2026)

Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids
by: Kirton-Wingate, Jasper, et al.
Published: (2024)

Active Speech Enhancement: Active Speech Denoising Decliping and Deveraberation
by: Yaish, Ofir, et al.
Published: (2025)

HyBeam: Hybrid Microphone-Beamforming Array-Agnostic Speech Enhancement for Wearables
by: Ilan, Yuval Bar, et al.
Published: (2025)

Do we really need Self-Attention for Streaming Automatic Speech Recognition?
by: Dkhissi, Youness, et al.
Published: (2026)

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)

A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)

AmbiDrop: Array-Agnostic Speech Enhancement Using Ambisonics Encoding and Dropout-Based Learning
by: Tatarjitzky, Michael, et al.
Published: (2025)

Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement
by: Bae, Jae-Sung, et al.
Published: (2025)

Linguistic Knowledge Transfer Learning for Speech Enhancement
by: Hung, Kuo-Hsuan, et al.
Published: (2025)

Joint Learning using Mixture-of-Expert-Based Representation for Speech Enhancement and Robust Emotion Recognition
by: Tzeng, Jing-Tong, et al.
Published: (2025)

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)

The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge
by: Guo, Yiwei, et al.
Published: (2024)

A Mel Spectrogram Enhancement Paradigm Based on CWT in Speech Synthesis
by: Hu, Guoqiang, et al.
Published: (2024)

Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning
by: Guo, Zilu, et al.
Published: (2023)

Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement
by: Khan, Muhammad Salman, et al.
Published: (2024)

A Dual-Branch Parallel Network for Speech Enhancement and Restoration
by: Yang, Da-Hee, et al.
Published: (2024)

Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer
by: Wang, Yongqi, et al.
Published: (2023)

Exploring Resolution-Wise Shared Attention in Hybrid Mamba-U-Nets for Improved Cross-Corpus Speech Enhancement
by: Kühne, Nikolai Lund, et al.
Published: (2025)