| Main Authors: | Andrade-Miranda, G., Chatzipapas, K., Arias-Londoño, J. D., Godino-Llorente, J. I. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.15054 |
Similar Items
Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables
by: Yu, Chin-Yun, et al.
Published: (2023)
NeuroVoz: a Castillian Spanish corpus of parkinsonian speech
by: Mendes-Laureano, Janaína, et al.
Published: (2024)
SNC: A Stem-Native Codec for Efficient Lossless Audio Storage with Adaptive Playback Capabilities
by: Sufi, Shaad
Published: (2026)
Modeling and Estimation of Vocal Tract and Glottal Source Parameters Using ARMAX-LF Model
by: Lia, Kai, et al.
Published: (2024)
Del Visual al Auditivo: Sonorización de Escenas Guiada por Imagen
by: Sánchez, María, et al.
Published: (2024)
Construction and Analysis of Impression Caption Dataset for Environmental Sounds
by: Okamoto, Yuki, et al.
Published: (2024)
Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes
by: Nguyen, Binh Thien, et al.
Published: (2025)
Can Audio Reveal Music Performance Difficulty? Insights from the Piano Syllabus Dataset
by: Ramoneda, Pedro, et al.
Published: (2024)
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
by: Du, Jiawei, et al.
Published: (2024)
Unseen but not Unknown: Using Dataset Concealment to Robustly Evaluate Speech Quality Estimation Models
by: Pieper, Jaden, et al.
Published: (2026)
MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)
SimuSOE: A Simulated Snoring Dataset for Obstructive Sleep Apnea-Hypopnea Syndrome Evaluation during Wakefulness
by: Lin, Jie, et al.
Published: (2024)
CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset
by: Chen, Xuanjun, et al.
Published: (2025)
Toward Multimodal Industrial Fault Analysis: A Single-Speed Chain Conveyor Dataset with Audio and Vibration Signals
by: Chen, Zhang, et al.
Published: (2026)
Who Said What WSW 2.0? Enhanced Automated Analysis of Preschool Classroom Speech
by: Sun, Anchen, et al.
Published: (2025)
An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)
Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis
by: Choi, Anna Seo Gyeong, et al.
Published: (2025)
Facilitating deep acoustic phenotyping: A basic coding scheme of infant vocalisations preluding computational analysis, machine learning and clinical reasoning
by: Kulvicius, Tomas, et al.
Published: (2023)
Evaluating Parkinson's Disease Detection in Anonymized Speech: A Performance and Acoustic Analysis
by: Franzreb, Carlos, et al.
Published: (2026)
ASPED: An Audio Dataset for Detecting Pedestrians
by: Seshadri, Pavan, et al.
Published: (2023)
STraDa: A Singer Traits Dataset
by: Kong, Yuexuan, et al.
Published: (2024)
Mamba-based Segmentation Model for Speaker Diarization
by: Plaquet, Alexis, et al.
Published: (2024)
A Dataset for Automatic Assessment of TTS Quality in Spanish
by: Welford, Alejandro Sosa, et al.
Published: (2025)
Audio-Language Datasets of Scenes and Events: A Survey
by: Wijngaard, Gijs, et al.
Published: (2024)
CUEMPATHY: A Counseling Speech Dataset for Psychotherapy Research
by: Tao, Dehua, et al.
Published: (2024)
Dataset-Distillation Generative Model for Speech Emotion Recognition
by: Ritter-Gutierrez, Fabian, et al.
Published: (2024)
MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
by: Müller, Nicolas M., et al.
Published: (2024)
Vision Transformer Segmentation for Visual Bird Sound Denoising
by: Kumar, Sahil, et al.
Published: (2024)
Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models
by: Wang, Bin, et al.
Published: (2025)
Advancing Topic Segmentation of Broadcasted Speech with Multilingual Semantic Embeddings
by: Shukla, Sakshi Deo, et al.
Published: (2024)
Comparative Analysis of ASR Methods for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)
SoundCollage: Automated Discovery of New Classes in Audio Datasets
by: Choi, Ryuhaerang, et al.
Published: (2024)
UrBAN: Urban Beehive Acoustics and PheNotyping Dataset
by: Abdollahi, Mahsa, et al.
Published: (2024)
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
by: Zhao, Yan, et al.
Published: (2022)
The Florence Price Art Song Dataset and Piano Accompaniment Generator
by: He, Tao-Tao, et al.
Published: (2025)
Binamix -- A Python Library for Generating Binaural Audio Datasets
by: Barry, Dan, et al.
Published: (2025)
ICSD: An Open-source Dataset for Infant Cry and Snoring Detection
by: Liu, Qingyu, et al.
Published: (2024)
The Extended SONICOM HRTF Dataset and Spatial Audio Metrics Toolbox
by: Poole, Katarina C., et al.
Published: (2025)
Advances in Speech Separation: Techniques, Challenges, and Future Trends
by: Li, Kai, et al.
Published: (2025)
ASAudio: A Survey of Advanced Spatial Audio Research
by: Zhu, Zhiyuan, et al.
Published: (2025)