:: Library Catalog

Bibliographic Details
Main Authors:	Fu, Szu-Wei; Chao, Rong; Yang, Xuesong; Huang, Sung-Feng; Zezario, Ryandhimas E.; Nasretdinov, Rauf; Jukić, Ante; Tsao, Yu; Wang, Yu-Chiang Frank
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2603.02641

Similar Items

Universal Speech Enhancement with Regression and Generative Mamba
by: Chao, Rong, et al.
Published: (2025)

Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)

A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023)

Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)

Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)

Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features
by: Zezario, Ryandhimas E., et al.
Published: (2021)

A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2024)

Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration
by: Ku, Pin-Jui, et al.
Published: (2024)

Non-Intrusive Intelligibility Prediction for Hearing Aids: Recent Advances, Trends, and Challenges
by: Zezario, Ryandhimas E.
Published: (2025)

Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)

Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)

Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
by: Zezario, Ryandhimas E., et al.
Published: (2025)

A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2025)

Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata
by: Zezario, Ryandhimas E., et al.
Published: (2023)

A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)

HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)

An Investigation of Incorporating Mamba for Speech Enhancement
by: Chao, Rong, et al.
Published: (2024)

STSM-FiLM: A FiLM-Conditioned Neural Architecture for Time-Scale Modification of Speech
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)

Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits
by: Huang, Sung-Feng, et al.
Published: (2025)

Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)

Neuro-MSBG: An End-to-End Neural Model for Hearing Loss Simulation
by: Yuan, Hui-Guan, et al.
Published: (2025)

NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference
by: Casanova, Edresson, et al.
Published: (2025)

Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement
by: Khan, Muhammad Salman, et al.
Published: (2024)

NeuroAMP: A Novel End-to-end General Purpose Deep Neural Amplifier for Personalized Hearing Aids
by: Ahmed, Shafique, et al.
Published: (2025)

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
by: Ren, Wenze, et al.
Published: (2024)

Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference
by: Casanova, Edresson, et al.
Published: (2024)

RankUp: Boosting Semi-Supervised Regression with an Auxiliary Ranking Classifier
by: Huang, Pin-Yen, et al.
Published: (2024)

DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)

Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)

Restorative Speech Enhancement: A Progressive Approach Using SE and Codec Modules
by: Chiang, Hsin-Tien, et al.
Published: (2024)

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
by: Wang, Siyin, et al.
Published: (2025)

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)

How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation
by: Lu, Ke-Han, et al.
Published: (2026)

Universal Discrete-Domain Speech Enhancement
by: Liu, Fei, et al.
Published: (2025)

GAP-URGENet: A Generative-Predictive Fusion Framework for Universal Speech Enhancement
by: Rong, Xiaobin, et al.
Published: (2026)