| Main Authors: | Sultana, Subrina, Williamson, Donald S. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.04379 |
Similar Items
AttentiveMOS: A Lightweight Attention-Only Model for Speech Quality Prediction
by: Kibria, Imran E, et al.
Published: (2024)
CORN: Co-Trained Full- And No-Reference Speech Quality Assessment
by: Manocha, Pranay, et al.
Published: (2023)
JSQA: Speech Quality Assessment with Perceptually-Inspired Contrastive Pretraining Based on JND Audio Pairs
by: Fan, Junyi, et al.
Published: (2025)
Multivariate Probabilistic Assessment of Speech Quality
by: Cumlin, Fredrik, et al.
Published: (2025)
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM
by: Shi, Jiatong, et al.
Published: (2025)
End-to-End Speech Recognition with Pre-trained Masked Language Model
by: Higuchi, Yosuke, et al.
Published: (2024)
Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
by: Yang, Yifan, et al.
Published: (2026)
Calibration-Reasoning Framework for Descriptive Speech Quality Assessment
by: Kostenok, Elizaveta, et al.
Published: (2026)
Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
by: Zhu, Xiaoxu, et al.
Published: (2025)
SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech
by: Lin, Jingru, et al.
Published: (2024)
Leveraging LLMs for Scalable Non-intrusive Speech Quality Assessment
by: Cumlin, Fredrik, et al.
Published: (2025)
Target Speech Extraction with Pre-trained Self-supervised Learning Models
by: Peng, Junyi, et al.
Published: (2024)
Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)
Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
by: Hu, Cheng-Hung, et al.
Published: (2025)
From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)
SCOREQ: Speech Quality Assessment with Contrastive Regression
by: Ragano, Alessandro, et al.
Published: (2024)
A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models
by: Whetten, Ryan, et al.
Published: (2026)
Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks
by: Ma, Duo, et al.
Published: (2024)
Speaker-Conditioned Phrase Break Prediction for Text-to-Speech with Phoneme-Level Pre-trained Language Model
by: Yang, Dong, et al.
Published: (2025)
Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders
by: Sun, Xingwei, et al.
Published: (2025)
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)
Self-Supervised Speech Quality Assessment (S3QA): Leveraging Speech Foundation Models for a Scalable Speech Quality Metric
by: Ogg, Mattson, et al.
Published: (2025)
HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2025)
Using RLHF to align speech enhancement approaches to mean-opinion quality scores
by: Kumar, Anurag, et al.
Published: (2024)
Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
by: Wu, Wenxuan, et al.
Published: (2024)
Assessing the Impact of Noise and Speech Enhancement on the Intelligibility of Speech Codecs
by: Behringer, Lyonel, et al.
Published: (2026)
French Listening Tests for the Assessment of Intelligibility, Quality, and Identity of Body-Conducted Speech Enhancement
by: Joubaud, Thomas, et al.
Published: (2025)
MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2026)
Universal Preference-Score-based Pairwise Speech Quality Assessment
by: Shi, Yu-Fei, et al.
Published: (2025)
Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)
PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models
by: Feng, Tiantian, et al.
Published: (2023)
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)
A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)
MambaRate: Speech Quality Assessment Across Different Sampling Rates
by: Kakoulidis, Panos, et al.
Published: (2025)
Neural Encoding Detection is Not All You Need for Synthetic Speech Detection
by: Cuccovillo, Luca, et al.
Published: (2026)
On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement
by: López-Espejo, Iván, et al.
Published: (2024)
Investigation of Speech and Noise Latent Representations in Single-channel VAE-based Speech Enhancement
by: Li, Jiatong, et al.
Published: (2025)
Binaural Localization Model for Speech in Noise
by: Tokala, Vikas, et al.
Published: (2025)
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
by: Zeng, Aohan, et al.
Published: (2024)