| Main Authors: | Uddin, Majbah, Huynh, Nathan, Vidal, Jose M, Taaffe, Kevin M, Fredendall, Lawrence D, Greenstein, Joel S |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.03369 |
Similar Items
Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications
by: de Groot, Dimme, et al.
Published: (2025)
VoiceGRPO: Modern MoE Transformers with Group Relative Policy Optimization GRPO for AI Voice Health Care Applications on Voice Pathology Detection
by: Togootogtokh, Enkhtogtokh, et al.
Published: (2025)
TidyVoice 2026 Challenge Evaluation Plan
by: Farhadipour, Aref, et al.
Published: (2026)
Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)
VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing
by: Zheng, Zhisheng, et al.
Published: (2025)
Vo-Ve: An Explainable Voice-Vector for Speaker Identity Evaluation
by: Lee, Jaejun, et al.
Published: (2025)
Development of the Listening in Spatialized Noise-Sentences (LiSN-S) Test in Brazilian Portuguese: Presentation Software, Speech Stimuli, and Sentence Equivalence
by: Masiero, Bruno S., et al.
Published: (2024)
PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation
by: Koshal, Devyani, et al.
Published: (2024)
An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)
End-to-End Integration of Speech Emotion Recognition with Voice Activity Detection using Self-Supervised Learning Features
by: Yamashita, Natsuo, et al.
Published: (2024)
Automatic Voice Classification Of Autistic Subjects
by: Vacca, Jessica, et al.
Published: (2024)
EchoVoices: Preserving Generational Voices and Memories for Seniors and Children
by: Xu, Haiying, et al.
Published: (2025)
EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful Sentences
by: Li, Baifeng, et al.
Published: (2023)
Auden-Voice: General-Purpose Voice Encoder for Speech and Language Understanding
by: Huo, Mingyue, et al.
Published: (2025)
Human Voice is Unique
by: Singh, Rita, et al.
Published: (2025)
AdaProj: Adaptively Scaled Angular Margin Subspace Projections for Anomalous Sound Detection with Auxiliary Classification Tasks
by: Wilkinghoff, Kevin
Published: (2024)
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
by: Seki, Kentaro, et al.
Published: (2024)
LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models
by: Kameoka, Hirokazu, et al.
Published: (2025)
VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
by: Kameoka, Hirokazu, et al.
Published: (2020)
Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework
by: Byun, Kyungguen, et al.
Published: (2025)
A Transversal Study of Fundamental Frequency Contours in Parkinsonian Voices
by: Rodriguez-Perez, Pablo, et al.
Published: (2024)
SingIt! Singer Voice Transformation
by: Eliav, Amit, et al.
Published: (2024)
Controlling your Attributes in Voice
by: Li, Xuyuan, et al.
Published: (2025)
Objective Measurements of Voice Quality
by: Dhamyal, Hira, et al.
Published: (2024)
OneVoice: One Model, Triple Scenarios-Towards Unified Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2026)
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)
TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice
by: Farhadipour, Aref, et al.
Published: (2026)
Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion
by: Ruggiero, Giuseppe, et al.
Published: (2024)
Machine Unlearning in Speech Emotion Recognition via Forget Set Alone
by: Ren, Zhao, et al.
Published: (2025)
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)
Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
by: Kim, Heeseung, et al.
Published: (2025)
Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
by: Bhogale, Kaushal, et al.
Published: (2026)
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
by: Lin, Yueqian, et al.
Published: (2025)
Revolutionizing Personalized Voice Synthesis: The Journey towards Emotional and Individual Authenticity with DIVSE (Dynamic Individual Voice Synthesis Engine)
by: Shi, Fan
Published: (2023)
Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)
Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)
Implementation and Applications of WakeWords Integrated with Speaker Recognition: A Case Study
by: Filho, Alexandre Costa Ferro, et al.
Published: (2024)
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)