:: Library Catalog

Bibliographic Details
Main Authors:	Webber, Jacob J, Watts, Oliver, Henter, Gustav Eje, Williams, Jennifer, King, Simon
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2409.14919

Similar Items

When Voice Matters: Evidence of Gender Disparity in Positional Bias of SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)

Do Bias Benchmarks Generalise? Evidence from Voice-based Evaluation of Gender Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)

Speak Your Mind: The Speech Continuation Task as a Probe of Voice-Based Model Bias
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2026)

VoXtream: Full-Stream Text-to-Speech with Extremely Low Latency
by: Torgashov, Nikita, et al.
Published: (2025)

VoXtream2: Full-stream TTS with dynamic speaking rate control
by: Torgashov, Nikita, et al.
Published: (2026)

HiFi-Glot: High-Fidelity Neural Formant Synthesis with Differentiable Resonant Filters
by: Gu, Yicheng, et al.
Published: (2024)

Comparator Loss: An Ordinal Contrastive Loss to Derive a Severity Score for Speech-based Health Monitoring
by: Webber, Jacob J, et al.
Published: (2025)

Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
by: Guichoux, Téo, et al.
Published: (2025)

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
by: Seki, Kentaro, et al.
Published: (2024)

RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
by: Chen, Wei, et al.
Published: (2024)

RAVE for Speech: Efficient Voice Conversion at High Sampling Rates
by: Bargum, Anders R., et al.
Published: (2024)

Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework
by: Byun, Kyungguen, et al.
Published: (2025)

ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
by: Zhu, Xinfa, et al.
Published: (2025)

Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
by: Dhar, Sandipan, et al.
Published: (2025)

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)

On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection
by: Guo, Chenyang, et al.
Published: (2024)

StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

SOVA-Bench: Benchmarking the Speech Conversation Ability for LLM-based Voice Assistant
by: Hou, Yixuan, et al.
Published: (2025)

Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
by: Zhang, Xueyao, et al.
Published: (2023)

Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)

LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models
by: Kameoka, Hirokazu, et al.
Published: (2025)

VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
by: Kameoka, Hirokazu, et al.
Published: (2020)

Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
by: Zhou, Wangjin, et al.
Published: (2024)

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)

Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion
by: Ruggiero, Giuseppe, et al.
Published: (2024)

OneVoice: One Model, Triple Scenarios-Towards Unified Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2026)

Residual Speaker Representation for One-Shot Voice Conversion
by: Xu, Le, et al.
Published: (2023)

StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
by: Kim, Heeseung, et al.
Published: (2025)

Collective Learning Mechanism based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion
by: Dhar, Sandipan, et al.
Published: (2025)

REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion
by: Biyani, Ishan D., et al.
Published: (2025)

VC-ENHANCE: Speech Restoration with Integrated Noise Suppression and Voice Conversion
by: Byun, Kyungguen, et al.
Published: (2024)

SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
by: Saito, Yuki, et al.
Published: (2024)

An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
by: Kang, Wonjune, et al.
Published: (2022)

FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
by: Ferreira, Alef Iury Siqueira, et al.
Published: (2025)

Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
by: Kim, Tae-Woo, et al.
Published: (2022)

Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)