:: Library Catalog

Bibliographic Details
Main Authors:	Jahani, Jandad; Dawodi, Mursal; Baktash, Jawid Ahmad
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Sound
Online Access:	https://arxiv.org/abs/2602.14062

Similar Items

Structural Stress and Learned Helplessness in Afghanistan: A Multi-Layer Analysis of the AFSTRESS Dari Corpus
by: Baktash, Jawid Ahmad, et al.
Published: (2026)

Female Student Population at Kabul University Before the 2021 Ban: Trends, Gender Parity, and Faculty-Level Dynamics
by: Baktash, Jawid Ahmad, et al.
Published: (2025)

DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube
by: Baktash, Jawid Ahmad, et al.
Published: (2026)

PashtoTTS-Bench: automated screening for low-resource non-Latin-script text-to-speech
by: Rahman, Hanif
Published: (2026)

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
by: Xu, Ke, et al.
Published: (2026)

CommonVoice-SpeechRE and RPG-MoGe: Advancing Speech Relation Extraction with a New Dataset and Multi-Order Generative Framework
by: Ning, Jinzhong, et al.
Published: (2025)

LearnerVoice: A Dataset of Non-Native English Learners' Spontaneous Speech
by: Kim, Haechan, et al.
Published: (2024)

Sonos Voice Control Bias Assessment Dataset: A Methodology for Demographic Bias Assessment in Voice Assistants
by: Sekkat, Chloé, et al.
Published: (2024)

Persian MusicGen: A Large-Scale Dataset and Culturally-Aware Generative Model for Persian Music
by: Sameti, Mohammad Hossein, et al.
Published: (2026)

Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
by: Bhogale, Kaushal, et al.
Published: (2026)

Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)

Lombard Speech Synthesis for Any Voice with Controllable Style Embeddings
by: Akti, Seymanur, et al.
Published: (2026)

MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions
by: Huang, Kexin, et al.
Published: (2026)

ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis
by: Toyin, Hawau Olamide, et al.
Published: (2025)

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024)

Voice Communication Analysis in Esports
by: Vinot, Aymeric, et al.
Published: (2024)

CASPER: A Large Scale Spontaneous Speech Dataset
by: Xiao, Cihan, et al.
Published: (2025)

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play
by: Shi, Yemin, et al.
Published: (2025)

Analysis of Speech Temporal Dynamics in the Context of Speaker Verification and Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2024)

Investigation for Relative Voice Impression Estimation
by: Fujita, Kenichi, et al.
Published: (2026)

MimicLM: Zero-Shot Voice Imitation through Autoregressive Modeling of Pseudo-Parallel Speech Corpora
by: Feng, Tao, et al.
Published: (2026)

The Third VoicePrivacy Challenge: Preserving Emotional Expressiveness and Linguistic Content in Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2026)

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing
by: Zheng, Zhisheng, et al.
Published: (2025)

Alethia: A Foundational Encoder for Voice Deepfakes
by: Zhu, Yi, et al.
Published: (2026)

Voice Biomarker Analysis and Automated Severity Classification of Dysarthric Speech in a Multilingual Context
by: Yeo, Eunjung
Published: (2024)

Voice Adaptation for Swiss German
by: Stucki, Samuel, et al.
Published: (2025)

Marco-Voice Technical Report
by: Tian, Fengping, et al.
Published: (2025)

Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
by: He, Haorui, et al.
Published: (2025)

Vevo2: A Unified and Controllable Framework for Speech and Singing Voice Generation
by: Zhang, Xueyao, et al.
Published: (2025)

A Review of Common Online Speaker Diarization Methods
by: Aperdannier, Roman, et al.
Published: (2024)

A Preliminary Exploration with GPT-4o Voice Mode
by: Lin, Yu-Xiang, et al.
Published: (2025)

VoiceBench: Benchmarking LLM-Based Voice Assistants
by: Chen, Yiming, et al.
Published: (2024)

HumMusQA: A Human-written Music Understanding QA Benchmark Dataset
by: Weck, Benno, et al.
Published: (2026)

IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
by: Sankar, Ashwin, et al.
Published: (2024)

Where Do Backdoors Live? A Component-Level Analysis of Backdoor Propagation in Speech Language Models
by: Fortier, Alexandrine, et al.
Published: (2025)

From Minutes to Days: Scaling Intracranial Speech Decoding with Supervised Pretraining
by: Evanson, Linnea, et al.
Published: (2025)

Finding My Voice: Generative Reconstruction of Disordered Speech for Automated Clinical Evaluation
by: Rosero, Karen, et al.
Published: (2025)

Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
by: Remme, Lian, et al.
Published: (2025)

VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition
by: Vu, Hoang Long, et al.
Published: (2024)

Pashto Common Voice: Building the First Open Speech Corpus for a 60-Million-Speaker Low-Resource Language
by: Rahman, Hanif, et al.
Published: (2026)