Saved in:
| Main Author: | Tsao, Yu |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.01889 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2026)
by: Ren, Wenze, et al.
Published: (2026)
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations
by: Gao, Xiaoxue, et al.
Published: (2024)
by: Gao, Xiaoxue, et al.
Published: (2024)
A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023)
by: Zezario, Ryandhimas E., et al.
Published: (2023)
A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)
by: Ahmed, Shafique, et al.
Published: (2025)
Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2024)
by: Zezario, Ryandhimas E., et al.
Published: (2024)
Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics
by: Wang, Syu-Siang, et al.
Published: (2024)
by: Wang, Syu-Siang, et al.
Published: (2024)
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
by: Wang, Helin, et al.
Published: (2025)
by: Wang, Helin, et al.
Published: (2025)
HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2025)
by: Ren, Wenze, et al.
Published: (2025)
Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)
by: Hussain, Tassadaq, et al.
Published: (2024)
EffortNet: A Deep Learning Framework for Objective Assessment of Speech Enhancement Technologies Using EEG-Based Alpha Oscillations
by: Sung, Ching-Chih, et al.
Published: (2025)
by: Sung, Ching-Chih, et al.
Published: (2025)
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
by: Yin, Chun, et al.
Published: (2024)
by: Yin, Chun, et al.
Published: (2024)
Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
by: Yang, Hsiang-Cheng, et al.
Published: (2026)
by: Yang, Hsiang-Cheng, et al.
Published: (2026)
Universal Speech Enhancement with Regression and Generative Mamba
by: Chao, Rong, et al.
Published: (2025)
by: Chao, Rong, et al.
Published: (2025)
An Investigation on Combining Geometry and Consistency Constraints into Phase Estimation for Speech Enhancement
by: Ho, Chun-Wei, et al.
Published: (2025)
by: Ho, Chun-Wei, et al.
Published: (2025)
Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2026)
by: Zezario, Ryandhimas E., et al.
Published: (2026)
STSM-FiLM: A FiLM-Conditioned Neural Architecture for Time-Scale Modification of Speech
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?
by: Aldeneh, Zakaria, et al.
Published: (2024)
by: Aldeneh, Zakaria, et al.
Published: (2024)
Towards General Discrete Speech Codec for Complex Acoustic Environments: A Study of Reconstruction and Downstream Task Consistency
by: Wang, Haoran, et al.
Published: (2025)
by: Wang, Haoran, et al.
Published: (2025)
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)
by: Zezario, Ryandhimas E., et al.
Published: (2023)
Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
FastEnhancer: Speed-Optimized Streaming Neural Speech Enhancement
by: Ahn, Sunghwan, et al.
Published: (2025)
by: Ahn, Sunghwan, et al.
Published: (2025)
A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2025)
by: Zezario, Ryandhimas E., et al.
Published: (2025)
CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)
by: Chen, Xuanjun, et al.
Published: (2025)
Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features
by: Zezario, Ryandhimas E., et al.
Published: (2021)
by: Zezario, Ryandhimas E., et al.
Published: (2021)
Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
by: Ren, Wenze, et al.
Published: (2024)
by: Ren, Wenze, et al.
Published: (2024)
Visual-Informed Speech Enhancement Using Attention-Based Beamforming
by: Liu, Chihyun, et al.
Published: (2026)
by: Liu, Chihyun, et al.
Published: (2026)
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)
by: Wang, Kuan-Chen, et al.
Published: (2024)
Multivariate Probabilistic Assessment of Speech Quality
by: Cumlin, Fredrik, et al.
Published: (2025)
by: Cumlin, Fredrik, et al.
Published: (2025)
The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)
by: Huang, Wen-Chin, et al.
Published: (2024)
Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)
by: Lai, Richard Lee, et al.
Published: (2023)
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement
by: Khan, Muhammad Salman, et al.
Published: (2024)
by: Khan, Muhammad Salman, et al.
Published: (2024)
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
by: Wang, Siyin, et al.
Published: (2025)
by: Wang, Siyin, et al.
Published: (2025)
EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations
by: Ren, Wenze, et al.
Published: (2024)
by: Ren, Wenze, et al.
Published: (2024)
Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)
by: Chao, Rong, et al.
Published: (2025)
HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)
Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids
by: Kirton-Wingate, Jasper, et al.
Published: (2024)
by: Kirton-Wingate, Jasper, et al.
Published: (2024)
Linguistic Knowledge Transfer Learning for Speech Enhancement
by: Hung, Kuo-Hsuan, et al.
Published: (2025)
by: Hung, Kuo-Hsuan, et al.
Published: (2025)
Coupling Speech Encoders with Downstream Text Models
by: Chelba, Ciprian, et al.
Published: (2024)
by: Chelba, Ciprian, et al.
Published: (2024)
Universal Preference-Score-based Pairwise Speech Quality Assessment
by: Shi, Yu-Fei, et al.
Published: (2025)
by: Shi, Yu-Fei, et al.
Published: (2025)
Similar Items
-
MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2026) -
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations
by: Gao, Xiaoxue, et al.
Published: (2024) -
A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023) -
A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025) -
Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)