| Main Authors: | Valin, Jean-Marc, Büthe, Jan, Mustafa, Ahmed, Klingbeil, Michael |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2212.04453 |
Similar Items
Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction
by: Valin, Jean-Marc, et al.
Published: (2024)
A lightweight and robust method for blind wideband-to-fullband extension of speech
by: Büthe, Jan, et al.
Published: (2024)
NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping
by: Büthe, Jan, et al.
Published: (2023)
RADE: A Neural Codec for Transmitting Speech over HF Radio Channels
by: Rowe, David, et al.
Published: (2025)
Real-time Stereo Speech Enhancement with Spatial-Cue Preservation based on Dual-Path Structure
by: Togami, Masahito, et al.
Published: (2024)
Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder
by: Xie, Yuying, et al.
Published: (2024)
Noise-Robust DSP-Assisted Neural Pitch Estimation with Very Low Complexity
by: Subramani, Krishna, et al.
Published: (2023)
Complex Recurrent Variational Autoencoder with Application to Speech Enhancement
by: Xie, Yuying, et al.
Published: (2022)
Variational Autoencoder for Personalized Pathological Speech Enhancement
by: Hou, Mingchi, et al.
Published: (2025)
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder
by: Melechovsky, Jan, et al.
Published: (2022)
Cochleagram-based Noise Adapted Speaker Identification System for Distorted Speech
by: Ahmed, Sabbir, et al.
Published: (2025)
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
by: Ochiai, Tsubasa, et al.
Published: (2024)
Learning Speech Representations with Variational Predictive Coding
by: Yeh, Sung-Lin, et al.
Published: (2025)
Clustering of Acoustic Environments with Variational Autoencoders for Hearing Devices
by: Fiorio, Luan Vinícius, et al.
Published: (2025)
Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition
by: Iakovenko, Olga, et al.
Published: (2024)
Word Error Rate Definitions and Algorithms for Long-Form Multi-talker Speech Recognition
by: von Neumann, Thilo, et al.
Published: (2025)
SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
by: Zhang, Xiangyu, et al.
Published: (2025)
Aligning Speech to Languages to Enhance Code-switching Speech Recognition
by: Liu, Hexin, et al.
Published: (2024)
Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks
by: Tokala, Vikas, et al.
Published: (2024)
HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment
by: Ren, Wenze, et al.
Published: (2025)
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis
by: Gu, Yu, et al.
Published: (2024)
Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification
by: Bahadi, Soufiyan, et al.
Published: (2024)
Perceptual Ratings Predict Speech Inversion Articulatory Kinematics in Childhood Speech Sound Disorders
by: Benway, Nina R., et al.
Published: (2025)
CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder
by: Cui, Jianwei, et al.
Published: (2024)
Rate-Aware Learned Speech Compression
by: Xu, Jun, et al.
Published: (2025)
Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding
by: Zheng, Rui-Chen, et al.
Published: (2025)
MambaRate: Speech Quality Assessment Across Different Sampling Rates
by: Kakoulidis, Panos, et al.
Published: (2025)
Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens
by: Zhao, Jinzheng, et al.
Published: (2024)
A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)
First Steps Towards Voice Anonymization for Code-Switching Speech
by: Meyer, Sarina, et al.
Published: (2025)
Parametric Object Coding in IVAS: Efficient Coding of Multiple Audio Objects at Low Bit Rates
by: Eichenseer, Andrea, et al.
Published: (2025)
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
by: Haeb-Umbach, Reinhold, et al.
Published: (2025)
Query-Based Asymmetric Modeling with Decoupled Input-Output Rates for Speech Restoration
by: Shin, Ui-Hyeop, et al.
Published: (2025)
Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS
by: Aronowitz, Hagai, et al.
Published: (2026)
AmbiDrop: Array-Agnostic Speech Enhancement Using Ambisonics Encoding and Dropout-Based Learning
by: Tatarjitzky, Michael, et al.
Published: (2025)
Group Relative Policy Optimization for Speech Recognition
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2025)
Reduction of Nonlinear Distortion in Condenser Microphones Using a Simple Post-Processing Technique
by: Honzík, Petr, et al.
Published: (2024)
Egonoise Resilient Source Localization and Speech Enhancement for Drones Using a Hybrid Model and Learning-Based Approach
by: Wu, Yihsuan, et al.
Published: (2025)
Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)
Target Speech Extraction with Pre-trained Self-supervised Learning Models
by: Peng, Junyi, et al.
Published: (2024)