Saved in:
| Main Authors: | Fu, Szu-Wei, Chao, Rong, Yang, Xuesong, Huang, Sung-Feng, Zezario, Ryandhimas E., Nasretdinov, Rauf, Jukić, Ante, Tsao, Yu, Wang, Yu-Chiang Frank |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.02641 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Universal Speech Enhancement with Regression and Generative Mamba
by: Chao, Rong, et al.
Published: (2025)
by: Chao, Rong, et al.
Published: (2025)
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025)
by: Nasretdinov, Rauf, et al.
Published: (2025)
A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023)
by: Zezario, Ryandhimas E., et al.
Published: (2023)
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)
by: Fu, Szu-Wei, et al.
Published: (2024)
Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)
by: Zezario, Ryandhimas E., et al.
Published: (2026)
The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)
by: Huang, Wen-Chin, et al.
Published: (2024)
Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features
by: Zezario, Ryandhimas E., et al.
Published: (2021)
by: Zezario, Ryandhimas E., et al.
Published: (2021)
A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2024)
by: Zezario, Ryandhimas E., et al.
Published: (2024)
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration
by: Ku, Pin-Jui, et al.
Published: (2024)
by: Ku, Pin-Jui, et al.
Published: (2024)
Non-Intrusive Intelligibility Prediction for Hearing Aids: Recent Advances, Trends, and Challenges
by: Zezario, Ryandhimas E.
Published: (2025)
by: Zezario, Ryandhimas E.
Published: (2025)
Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)
by: Zezario, Ryandhimas E., et al.
Published: (2023)
Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata
by: Zezario, Ryandhimas E., et al.
Published: (2023)
by: Zezario, Ryandhimas E., et al.
Published: (2023)
A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)
by: Ahmed, Shafique, et al.
Published: (2025)
Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)
by: Chao, Rong, et al.
Published: (2025)
HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)
An Investigation of Incorporating Mamba for Speech Enhancement
by: Chao, Rong, et al.
Published: (2024)
by: Chao, Rong, et al.
Published: (2024)
STSM-FiLM: A FiLM-Conditioned Neural Architecture for Time-Scale Modification of Speech
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits
by: Huang, Sung-Feng, et al.
Published: (2025)
by: Huang, Sung-Feng, et al.
Published: (2025)
Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
Neuro-MSBG: An End-to-End Neural Model for Hearing Loss Simulation
by: Yuan, Hui-Guan, et al.
Published: (2025)
by: Yuan, Hui-Guan, et al.
Published: (2025)
NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference
by: Casanova, Edresson, et al.
Published: (2025)
by: Casanova, Edresson, et al.
Published: (2025)
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement
by: Khan, Muhammad Salman, et al.
Published: (2024)
by: Khan, Muhammad Salman, et al.
Published: (2024)
NeuroAMP: A Novel End-to-end General Purpose Deep Neural Amplifier for Personalized Hearing Aids
by: Ahmed, Shafique, et al.
Published: (2025)
by: Ahmed, Shafique, et al.
Published: (2025)
Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
by: Ren, Wenze, et al.
Published: (2024)
by: Ren, Wenze, et al.
Published: (2024)
Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference
by: Casanova, Edresson, et al.
Published: (2024)
by: Casanova, Edresson, et al.
Published: (2024)
RankUp: Boosting Semi-Supervised Regression with an Auxiliary Ranking Classifier
by: Huang, Pin-Yen, et al.
Published: (2024)
by: Huang, Pin-Yen, et al.
Published: (2024)
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)
by: Lu, Ke-Han, et al.
Published: (2024)
Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)
by: Hussain, Tassadaq, et al.
Published: (2024)
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)
by: Ren, Wenze, et al.
Published: (2024)
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)
by: Wang, Kuan-Chen, et al.
Published: (2024)
Restorative Speech Enhancement: A Progressive Approach Using SE and Codec Modules
by: Chiang, Hsin-Tien, et al.
Published: (2024)
by: Chiang, Hsin-Tien, et al.
Published: (2024)
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
by: Wang, Siyin, et al.
Published: (2025)
by: Wang, Siyin, et al.
Published: (2025)
Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)
by: Chen, Yanan, et al.
Published: (2024)
LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)
by: Chen, Chih-Ning, et al.
Published: (2026)
How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation
by: Lu, Ke-Han, et al.
Published: (2026)
by: Lu, Ke-Han, et al.
Published: (2026)
Universal Discrete-Domain Speech Enhancement
by: Liu, Fei, et al.
Published: (2025)
by: Liu, Fei, et al.
Published: (2025)
GAP-URGENet: A Generative-Predictive Fusion Framework for Universal Speech Enhancement
by: Rong, Xiaobin, et al.
Published: (2026)
by: Rong, Xiaobin, et al.
Published: (2026)
Similar Items
-
Universal Speech Enhancement with Regression and Generative Mamba
by: Chao, Rong, et al.
Published: (2025) -
Robust Speech Recognition with Schrödinger Bridge-Based Speech Enhancement
by: Nasretdinov, Rauf, et al.
Published: (2025) -
A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023) -
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024) -
Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)