Library Catalog

Bibliographic Details
Main Authors:	Sultana, Subrina; Williamson, Donald S.
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2411.04379

Similar Items

AttentiveMOS: A Lightweight Attention-Only Model for Speech Quality Prediction
by: Kibria, Imran E, et al.
Published: (2024)

CORN: Co-Trained Full- And No-Reference Speech Quality Assessment
by: Manocha, Pranay, et al.
Published: (2023)

JSQA: Speech Quality Assessment with Perceptually-Inspired Contrastive Pretraining Based on JND Audio Pairs
by: Fan, Junyi, et al.
Published: (2025)

Multivariate Probabilistic Assessment of Speech Quality
by: Cumlin, Fredrik, et al.
Published: (2025)

Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM
by: Shi, Jiatong, et al.
Published: (2025)

End-to-End Speech Recognition with Pre-trained Masked Language Model
by: Higuchi, Yosuke, et al.
Published: (2024)

Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
by: Yang, Yifan, et al.
Published: (2026)

Calibration-Reasoning Framework for Descriptive Speech Quality Assessment
by: Kostenok, Elizaveta, et al.
Published: (2026)

Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
by: Zhu, Xiaoxu, et al.
Published: (2025)

SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech
by: Lin, Jingru, et al.
Published: (2024)

Leveraging LLMs for Scalable Non-intrusive Speech Quality Assessment
by: Cumlin, Fredrik, et al.
Published: (2025)

Target Speech Extraction with Pre-trained Self-supervised Learning Models
by: Peng, Junyi, et al.
Published: (2024)

Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)

Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
by: Hu, Cheng-Hung, et al.
Published: (2025)

From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)

SCOREQ: Speech Quality Assessment with Contrastive Regression
by: Ragano, Alessandro, et al.
Published: (2024)

A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models
by: Whetten, Ryan, et al.
Published: (2026)

Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks
by: Ma, Duo, et al.
Published: (2024)

Speaker-Conditioned Phrase Break Prediction for Text-to-Speech with Phoneme-Level Pre-trained Language Model
by: Yang, Dong, et al.
Published: (2025)

Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders
by: Sun, Xingwei, et al.
Published: (2025)

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)

Self-Supervised Speech Quality Assessment (S3QA): Leveraging Speech Foundation Models for a Scalable Speech Quality Metric
by: Ogg, Mattson, et al.
Published: (2025)

HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2025)

Using RLHF to align speech enhancement approaches to mean-opinion quality scores
by: Kumar, Anurag, et al.
Published: (2024)

Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)

Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
by: Wu, Wenxuan, et al.
Published: (2024)

Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)

French Listening Tests for the Assessment of Intelligibility, Quality, and Identity of Body-Conducted Speech Enhancement
by: Joubaud, Thomas, et al.
Published: (2025)

MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2026)

Universal Preference-Score-based Pairwise Speech Quality Assessment
by: Shi, Yu-Fei, et al.
Published: (2025)

Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)

PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models
by: Feng, Tiantian, et al.
Published: (2023)

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)

A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)

MambaRate: Speech Quality Assessment Across Different Sampling Rates
by: Kakoulidis, Panos, et al.
Published: (2025)

Neural Encoding Detection is Not All You Need for Synthetic Speech Detection
by: Cuccovillo, Luca, et al.
Published: (2026)

On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement
by: López-Espejo, Iván, et al.
Published: (2024)

Investigation of Speech and Noise Latent Representations in Single-channel VAE-based Speech Enhancement
by: Li, Jiatong, et al.
Published: (2025)

Binaural Localization Model for Speech in Noise
by: Tokala, Vikas, et al.
Published: (2025)

Scaling Speech-Text Pre-training with Synthetic Interleaved Data
by: Zeng, Aohan, et al.
Published: (2024)