:: Library Catalog

Bibliographic Details
Main Authors:	Uddin, Majbah; Huynh, Nathan; Vidal, Jose M; Taaffe, Kevin M; Fredendall, Lawrence D; Greenstein, Joel S
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Computation and Language; Machine Learning; Sound
Online Access:	https://arxiv.org/abs/2402.03369

Similar Items

Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications
by: de Groot, Dimme, et al.
Published: (2025)

VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection
by: Togootogtokh, Enkhtogtokh, et al.
Published: (2025)

TidyVoice 2026 Challenge Evaluation Plan
by: Farhadipour, Aref, et al.
Published: (2026)

Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing
by: Zheng, Zhisheng, et al.
Published: (2025)

Vo-Ve: An Explainable Voice-Vector for Speaker Identity Evaluation
by: Lee, Jaejun, et al.
Published: (2025)

Development of the Listening in Spatialized Noise-Sentences (LiSN-S) Test in Brazilian Portuguese: Presentation Software, Speech Stimuli, and Sentence Equivalence
by: Masiero, Bruno S., et al.
Published: (2024)

PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation
by: Koshal, Devyani, et al.
Published: (2024)

An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)

End-to-End Integration of Speech Emotion Recognition with Voice Activity Detection using Self-Supervised Learning Features
by: Yamashita, Natsuo, et al.
Published: (2024)

Automatic Voice Classification Of Autistic Subjects
by: Vacca, Jessica, et al.
Published: (2024)

EchoVoices: Preserving Generational Voices and Memories for Seniors and Children
by: Xu, Haiying, et al.
Published: (2025)

EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful Sentences
by: Li, Baifeng, et al.
Published: (2023)

Auden-Voice: General-Purpose Voice Encoder for Speech and Language Understanding
by: Huo, Mingyue, et al.
Published: (2025)

Human Voice is Unique
by: Singh, Rita, et al.
Published: (2025)

AdaProj: Adaptively Scaled Angular Margin Subspace Projections for Anomalous Sound Detection with Auxiliary Classification Tasks
by: Wilkinghoff, Kevin
Published: (2024)

StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
by: Seki, Kentaro, et al.
Published: (2024)

LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models
by: Kameoka, Hirokazu, et al.
Published: (2025)

VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
by: Kameoka, Hirokazu, et al.
Published: (2020)

Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework
by: Byun, Kyungguen, et al.
Published: (2025)

A Transversal Study of Fundamental Frequency Contours in Parkinsonian Voices
by: Rodriguez-Perez, Pablo, et al.
Published: (2024)

SingIt! Singer Voice Transformation
by: Eliav, Amit, et al.
Published: (2024)

Controlling your Attributes in Voice
by: Li, Xuyuan, et al.
Published: (2025)

Objective Measurements of Voice Quality
by: Dhamyal, Hira, et al.
Published: (2024)

OneVoice: One Model, Triple Scenarios-Towards Unified Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2026)

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)

TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice
by: Farhadipour, Aref, et al.
Published: (2026)

Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion
by: Ruggiero, Giuseppe, et al.
Published: (2024)

Machine Unlearning in Speech Emotion Recognition via Forget Set Alone
by: Ren, Zhao, et al.
Published: (2025)

StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
by: Kim, Heeseung, et al.
Published: (2025)

Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
by: Bhogale, Kaushal, et al.
Published: (2026)

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
by: Lin, Yueqian, et al.
Published: (2025)

Revolutionizing Personalized Voice Synthesis: The Journey towards Emotional and Individual Authenticity with DIVSE (Dynamic Individual Voice Synthesis Engine)
by: Shi, Fan
Published: (2023)

Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)

Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)

Implementation and Applications of WakeWords Integrated with Speaker Recognition: A Case Study
by: Filho, Alexandre Costa Ferro, et al.
Published: (2024)

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)