:: Library Catalog

Bibliographic Details
Main Authors:	Hinrichs, Reemt; Damara, Muhamad Fadli; Preihs, Stephan; Ostermann, Jörn
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2601.09461

Similar Items

Pruning-aware Loss Functions for STOI-Optimized Pruned Recurrent Autoencoders for the Compression of the Stimulation Patterns of Cochlear Implants at Zero Delay
by: Hinrichs, Reemt, et al.
Published: (2025)

A Dataset for Automatic Vocal Mode Classification
by: Hinrichs, Reemt, et al.
Published: (2026)

LSTMSE-Net: Long Short Term Speech Enhancement Network for Audio-visual Speech Enhancement
by: Jain, Arnav, et al.
Published: (2024)

Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach
by: Abdullah, Abdulhady Abas, et al.
Published: (2024)

Future Full-Ocean Deep SSPs Prediction based on Hierarchical Long Short-Term Memory Neural Networks
by: Lu, Jiajun, et al.
Published: (2023)

Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2024)

Combined Generative and Predictive Modeling for Speech Super-resolution
by: Wang, Heming, et al.
Published: (2024)

Non-Invasive Suicide Risk Prediction Through Speech Analysis
by: Amiriparian, Shahin, et al.
Published: (2024)

Perceived Femininity in Singing Voice: Analysis and Prediction
by: Kong, Yuexuan, et al.
Published: (2025)

Harmonic Detection from Noisy Speech with Auditory Frame Gain for Intelligibility Enhancement
by: Queiroz, A., et al.
Published: (2024)

Investigation of Zero-shot Text-to-Speech Models for Enhancing Short-Utterance Speaker Verification
by: Zhao, Yiyang, et al.
Published: (2025)

Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction
by: Zhang, Leying, et al.
Published: (2025)

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)

Faster Speech-LLaMA Inference with Multi-token Prediction
by: Raj, Desh, et al.
Published: (2024)

Latent-Domain Predictive Neural Speech Coding
by: Jiang, Xue, et al.
Published: (2022)

A Multi-decoder Neural Tracking Method for Accurately Predicting Speech Intelligibility
by: Sonck, Rien, et al.
Published: (2026)

Time vs. Layer: Locating Predictive Cues for Dysarthric Speech Descriptors in wav2vec 2.0
by: Engert, Natalie, et al.
Published: (2026)

DeepGESI: A Non-Intrusive Objective Evaluation Model for Predicting Speech Intelligibility in Hearing-Impaired Listeners
by: Luo, Wenyu, et al.
Published: (2025)

Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
by: Chen, Li-Wei, et al.
Published: (2024)

Representing Speech Through Autoregressive Prediction of Cochlear Tokens
by: Tuckute, Greta, et al.
Published: (2025)

Multi-Utterance Speech Separation and Association Trained on Short Segments
by: Wang, Yuzhu, et al.
Published: (2025)

MTP-S2UT: Enhancing Speech-to-Speech Translation Quality with Multi-token Prediction
by: Wang, Jianjin, et al.
Published: (2025)

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
by: Xu, Mohan, et al.
Published: (2024)

GAP-URGENet: A Generative-Predictive Fusion Framework for Universal Speech Enhancement
by: Rong, Xiaobin, et al.
Published: (2026)

Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2025)

KALL-E: Autoregressive Speech Synthesis with Next-Distribution Prediction
by: Xia, Kangxiang, et al.
Published: (2024)

Stage-Wise and Prior-Aware Neural Speech Phase Prediction
by: Liu, Fei, et al.
Published: (2024)

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders
by: Shi, Hao, et al.
Published: (2023)

Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation
by: Lin, Zhennan, et al.
Published: (2025)

No Audiogram: Leveraging Existing Scores for Personalized Speech Intelligibility Prediction
by: Zhou, Haoshuai, et al.
Published: (2025)

Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)

Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
by: Xue, Ke, et al.
Published: (2026)

Non-Intrusive Binaural Speech Intelligibility Prediction Using Mamba for Hearing-Impaired Listeners
by: Yamamoto, Katsuhiko, et al.
Published: (2025)

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features
by: Tsunoo, Emiru, et al.
Published: (2024)

Unveiling the Best Practices for Applying Speech Foundation Models to Speech Intelligibility Prediction for Hearing-Impaired People
by: Zhou, Haoshuai, et al.
Published: (2025)

Articulatory Feature Prediction from Surface EMG during Speech Production
by: Lee, Jihwan, et al.
Published: (2025)

Causal Self-supervised Pretrained Frontend with Predictive Code for Speech Separation
by: Wang, Wupeng, et al.
Published: (2025)

Selection of Layers from Self-supervised Learning Models for Predicting Mean-Opinion-Score of Speech
by: Liang, Xinyu, et al.
Published: (2025)

Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
by: Zezario, Ryandhimas E., et al.
Published: (2025)

Speech Enhancement with Dual-path Multi-Channel Linear Prediction Filter and Multi-norm Beamforming
by: Qin, Chengyuan, et al.
Published: (2025)