| Main Authors: | Jahani, Jandad, Dawodi, Mursal, Baktash, Jawid Ahmad |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.14062 |
Similar Items
Structural Stress and Learned Helplessness in Afghanistan: A Multi-Layer Analysis of the AFSTRESS Dari Corpus
by: Baktash, Jawid Ahmad, et al.
Published: (2026)
Female Student Population at Kabul University Before the 2021 Ban: Trends, Gender Parity, and Faculty-Level Dynamics
by: Baktash, Jawid Ahmad, et al.
Published: (2025)
DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube
by: Baktash, Jawid Ahmad, et al.
Published: (2026)
PashtoTTS-Bench: automated screening for low-resource non-Latin-script text-to-speech
by: Rahman, Hanif
Published: (2026)
From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
by: Xu, Ke, et al.
Published: (2026)
CommonVoice-SpeechRE and RPG-MoGe: Advancing Speech Relation Extraction with a New Dataset and Multi-Order Generative Framework
by: Ning, Jinzhong, et al.
Published: (2025)
LearnerVoice: A Dataset of Non-Native English Learners' Spontaneous Speech
by: Kim, Haechan, et al.
Published: (2024)
Sonos Voice Control Bias Assessment Dataset: A Methodology for Demographic Bias Assessment in Voice Assistants
by: Sekkat, Chloé, et al.
Published: (2024)
Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music
by: Sameti, Mohammad Hossein, et al.
Published: (2026)
Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
by: Bhogale, Kaushal, et al.
Published: (2026)
Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)
Lombard Speech Synthesis for Any Voice with Controllable Style Embeddings
by: Akti, Seymanur, et al.
Published: (2026)
MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions
by: Huang, Kexin, et al.
Published: (2026)
ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis
by: Toyin, Hawau Olamide, et al.
Published: (2025)
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024)
Voice Communication Analysis in Esports
by: Vinot, Aymeric, et al.
Published: (2024)
CASPER: A Large Scale Spontaneous Speech Dataset
by: Xiao, Cihan, et al.
Published: (2025)
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
by: Shi, Yemin, et al.
Published: (2025)
Analysis of Speech Temporal Dynamics in the Context of Speaker Verification and Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2024)
Investigation for Relative Voice Impression Estimation
by: Fujita, Kenichi, et al.
Published: (2026)
MimicLM: Zero-Shot Voice Imitation through Autoregressive Modeling of Pseudo-Parallel Speech Corpora
by: Feng, Tao, et al.
Published: (2026)
The Third VoicePrivacy Challenge: Preserving Emotional Expressiveness and Linguistic Content in Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2026)
VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing
by: Zheng, Zhisheng, et al.
Published: (2025)
Alethia: A Foundational Encoder for Voice Deepfakes
by: Zhu, Yi, et al.
Published: (2026)
Voice Biomarker Analysis and Automated Severity Classification of Dysarthric Speech in a Multilingual Context
by: Yeo, Eunjung
Published: (2024)
Voice Adaptation for Swiss German
by: Stucki, Samuel, et al.
Published: (2025)
Marco-Voice Technical Report
by: Tian, Fengping, et al.
Published: (2025)
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
by: He, Haorui, et al.
Published: (2025)
Vevo2: A Unified and Controllable Framework for Speech and Singing Voice Generation
by: Zhang, Xueyao, et al.
Published: (2025)
A Review of Common Online Speaker Diarization Methods
by: Aperdannier, Roman, et al.
Published: (2024)
A Preliminary Exploration with GPT-4o Voice Mode
by: Lin, Yu-Xiang, et al.
Published: (2025)
VoiceBench: Benchmarking LLM-Based Voice Assistants
by: Chen, Yiming, et al.
Published: (2024)
HumMusQA: A Human-written Music Understanding QA Benchmark Dataset
by: Weck, Benno, et al.
Published: (2026)
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
by: Sankar, Ashwin, et al.
Published: (2024)
Where Do Backdoors Live? A Component-Level Analysis of Backdoor Propagation in Speech Language Models
by: Fortier, Alexandrine, et al.
Published: (2025)
From Minutes to Days: Scaling Intracranial Speech Decoding with Supervised Pretraining
by: Evanson, Linnea, et al.
Published: (2025)
Finding My Voice: Generative Reconstruction of Disordered Speech for Automated Clinical Evaluation
by: Rosero, Karen, et al.
Published: (2025)
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
by: Remme, Lian, et al.
Published: (2025)
VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition
by: Vu, Hoang Long, et al.
Published: (2024)
Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language
by: Rahman, Hanif, et al.
Published: (2026)