| Main Authors: | Shafiei, Sepideh; Hakam, Shapour |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.11956 |
Similar Items
ShrutiSense: Microtonal Modeling and Correction in Indian Classical Music
by: Ghosh, Rajarshi, et al.
Published: (2025)
Voice Conversion with Diverse Intonation using Conditional Variational Auto-Encoder
by: Suh, Soobin, et al.
Published: (2025)
Live Vocal Extraction from K-pop Performances
by: Kim, Yujin, et al.
Published: (2025)
Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis
by: Li, Jialu, et al.
Published: (2023)
Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings
by: Gohari, Mahyar, et al.
Published: (2024)
Hearing Health in Home Healthcare: Leveraging LLMs for Illness Scoring and ALMs for Vocal Biomarker Extraction
by: Chen, Yu-Wen, et al.
Published: (2025)
Mel-RoFormer for Vocal Separation and Vocal Melody Transcription
by: Wang, Ju-Chiang, et al.
Published: (2024)
Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings
by: Zhao, He, et al.
Published: (2024)
Bird Vocalization Embedding Extraction Using Self-Supervised Disentangled Representation Learning
by: Shi, Runwu, et al.
Published: (2024)
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis
by: He, Xiangheng, et al.
Published: (2024)
CVSM: Contrastive Vocal Similarity Modeling
by: Garoufis, Christos, et al.
Published: (2025)
Audio Enhancement from Multiple Crowdsourced Recordings: A Simple and Effective Baseline
by: Aziz, Shiran, et al.
Published: (2024)
SwiftF0: Fast and Accurate Monophonic Pitch Detection
by: Nieradzik, Lars
Published: (2025)
Melodic and Metrical Elements of Expressiveness in Hindustani Vocal Music
by: Bhake, Yash, et al.
Published: (2025)
Auditory Representation Effective for Estimating Vocal Tract Information
by: Irino, Toshio, et al.
Published: (2023)
Algebraic Structures in Microtonal Music
by: Flynn, Veronica, et al.
Published: (2025)
A Reliable and Efficient Detection Pipeline for Rodent Ultrasonic Vocalizations
by: Anis, Sabah Shahnoor, et al.
Published: (2025)
Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology
by: Nobukawa, Rinka, et al.
Published: (2025)
Biodenoising: Animal Vocalization Denoising without Access to Clean Data
by: Miron, Marius, et al.
Published: (2024)
voc2vec: A Foundation Model for Non-Verbal Vocalization
by: Koudounas, Alkis, et al.
Published: (2025)
Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations
by: Li, Jialu, et al.
Published: (2024)
RMVPE: A Robust Model for Vocal Pitch Estimation in Polyphonic Music
by: Wei, Haojie, et al.
Published: (2023)
Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)
DiffVox: A Differentiable Model for Capturing and Analysing Vocal Effects Distributions
by: Yu, Chin-Yun, et al.
Published: (2025)
Learning Vocal-Tract Area and Radiation with a Physics-Informed Webster Model
by: Lu, Minhui, et al.
Published: (2026)
Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data
by: Liu, Yun, et al.
Published: (2024)
Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription
by: Ríos-Vila, Antonio, et al.
Published: (2024)
DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation
by: Wei, Haojie, et al.
Published: (2024)
Relating the Neural Representations of Vocalized, Mimed, and Imagined Speech
by: Maghsoudi, Maryam, et al.
Published: (2026)
The IRMA Dataset: A Structured Audio-MIDI Corpus for Iranian Classical Music
by: Shafiei, Sepideh, et al.
Published: (2025)
Spatial Reverberation and Dereverberation using an Acoustic Multiple-Input Multiple-Output System
by: Morgenstern, Hai, et al.
Published: (2024)
VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation
by: Kim, Yubin, et al.
Published: (2025)
Lightweight Self-Supervised Detection of Fundamental Frequency and Accurate Probability of Voicing in Monophonic Music
by: Bitra, Venkat Suprabath, et al.
Published: (2026)
Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation
by: Le, Dinh-Viet-Toan, et al.
Published: (2024)
A System for Melodic Harmonization using Schoenberg Regions, Giant Steps, and Church Modes
by: Fernandes, Frederick
Published: (2025)
Improved Feature Extraction Network for Neuro-Oriented Target Speaker Extraction
by: Fan, Cunhang, et al.
Published: (2025)
Towards the Synthesis of Non-speech Vocalizations
by: Hoq, Enjamamul, et al.
Published: (2024)
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings
by: Horiguchi, Shota, et al.
Published: (2024)
Feature Representations for Automatic Meerkat Vocalization Classification
by: Mahmoud, Imen Ben, et al.
Published: (2024)