:: Library Catalog

Bibliographic Details
Main Authors:	Valin, Jean-Marc; Büthe, Jan; Mustafa, Ahmed; Klingbeil, Michael
Format:	Preprint
Published:	2022
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2212.04453

Similar Items

Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction
by: Valin, Jean-Marc, et al.
Published: (2024)

A lightweight and robust method for blind wideband-to-fullband extension of speech
by: Büthe, Jan, et al.
Published: (2024)

NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping
by: Büthe, Jan, et al.
Published: (2023)

RADE: A Neural Codec for Transmitting Speech over HF Radio Channels
by: Rowe, David, et al.
Published: (2025)

Real-time Stereo Speech Enhancement with Spatial-Cue Preservation based on Dual-Path Structure
by: Togami, Masahito, et al.
Published: (2024)

Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder
by: Xie, Yuying, et al.
Published: (2024)

Noise-Robust DSP-Assisted Neural Pitch Estimation with Very Low Complexity
by: Subramani, Krishna, et al.
Published: (2023)

Complex Recurrent Variational Autoencoder with Application to Speech Enhancement
by: Xie, Yuying, et al.
Published: (2022)

Variational Autoencoder for Personalized Pathological Speech Enhancement
by: Hou, Mingchi, et al.
Published: (2025)

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder
by: Melechovsky, Jan, et al.
Published: (2022)

Cochleagram-based Noise Adapted Speaker Identification System for Distorted Speech
by: Ahmed, Sabbir, et al.
Published: (2025)

Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
by: Ochiai, Tsubasa, et al.
Published: (2024)

Learning Speech Representations with Variational Predictive Coding
by: Yeh, Sung-Lin, et al.
Published: (2025)

Clustering of Acoustic Environments with Variational Autoencoders for Hearing Devices
by: Fiorio, Luan Vinícius, et al.
Published: (2025)

Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition
by: Iakovenko, Olga, et al.
Published: (2024)

Word Error Rate Definitions and Algorithms for Long-Form Multi-talker Speech Recognition
by: von Neumann, Thilo, et al.
Published: (2025)

SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
by: Zhang, Xiangyu, et al.
Published: (2025)

Aligning Speech to Languages to Enhance Code-switching Speech Recognition
by: Liu, Hexin, et al.
Published: (2024)

Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks
by: Tokala, Vikas, et al.
Published: (2024)

HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2025)

DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis
by: Gu, Yu, et al.
Published: (2024)

Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification
by: Bahadi, Soufiyan, et al.
Published: (2024)

Perceptual Ratings Predict Speech Inversion Articulatory Kinematics in Childhood Speech Sound Disorders
by: Benway, Nina R., et al.
Published: (2025)

CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder
by: Cui, Jianwei, et al.
Published: (2024)

Rate-Aware Learned Speech Compression
by: Xu, Jun, et al.
Published: (2025)

Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding
by: Zheng, Rui-Chen, et al.
Published: (2025)

MambaRate: Speech Quality Assessment Across Different Sampling Rates
by: Kakoulidis, Panos, et al.
Published: (2025)

Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens
by: Zhao, Jinzheng, et al.
Published: (2024)

A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)

First Steps Towards Voice Anonymization for Code-Switching Speech
by: Meyer, Sarina, et al.
Published: (2025)

Parametric Object Coding in IVAS: Efficient Coding of Multiple Audio Objects at Low Bit Rates
by: Eichenseer, Andrea, et al.
Published: (2025)

Microphone Array Signal Processing and Deep Learning for Speech Enhancement
by: Haeb-Umbach, Reinhold, et al.
Published: (2025)

Query-Based Asymmetric Modeling with Decoupled Input-Output Rates for Speech Restoration
by: Shin, Ui-Hyeop, et al.
Published: (2025)

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS
by: Aronowitz, Hagai, et al.
Published: (2026)

AmbiDrop: Array-Agnostic Speech Enhancement Using Ambisonics Encoding and Dropout-Based Learning
by: Tatarjitzky, Michael, et al.
Published: (2025)

Group Relative Policy Optimization for Speech Recognition
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2025)

Reduction of Nonlinear Distortion in Condenser Microphones Using a Simple Post-Processing Technique
by: Honzík, Petr, et al.
Published: (2024)

Egonoise Resilient Source Localization and Speech Enhancement for Drones Using a Hybrid Model and Learning-Based Approach
by: Wu, Yihsuan, et al.
Published: (2025)

Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)

Target Speech Extraction with Pre-trained Self-supervised Learning Models
by: Peng, Junyi, et al.
Published: (2024)