| Main authors: | Bomediano, A. V., Conanan, R. J., Santuyo, L. D., Coronel, A. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2510.27530 |
Similar items
Quantum-Enhanced Analysis and Grading of Vocal Performance
by: Agarwal, Rohan
Published: (2025)
ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction
by: Kim, Minu, et al.
Published: (2025)
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering
by: Aristorenas, Aris J.
Published: (2024)
SoundPlot: An Open-Source Framework for Birdsong Acoustic Analysis and Neural Synthesis with Interactive 3D Visualization
by: Mehdi, Naqcho Ali, et al.
Published: (2026)
Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing
by: Cui, Hanfang, et al.
Published: (2025)
Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network
by: He, Zhanhong, et al.
Published: (2025)
Improving Cross-Lingual Phonetic Representation of Low-Resource Languages Through Language Similarity Analysis
by: Kim, Minu, et al.
Published: (2025)
STRUM: A Spectral Transcription and Rhythm Understanding Model for End-to-End Generation of Playable Rhythm-Game Charts
by: Opria, Joshua
Published: (2026)
Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds
by: Cauzinille, Jules, et al.
Published: (2025)
Distilled HuBERT for Mobile Speech Emotion Recognition: A Cross-Corpus Validation Study
by: Ismail, Saifelden M.
Published: (2025)
HELIX: Scaling Raw Audio Understanding with Hybrid Mamba-Attention Beyond the Quadratic Limit
by: Khushiyant, et al.
Published: (2026)
Prevailing Research Areas for Music AI in the Era of Foundation Models
by: Wei, Megan, et al.
Published: (2024)
Neural Proxies for Sound Synthesizers: Learning Perceptually Informed Preset Representations
by: Combes, Paolo, et al.
Published: (2025)
The evolution of inharmonicity and noisiness in contemporary popular music
by: Deruty, Emmanuel, et al.
Published: (2024)
OBHS: An Optimized Block Huffman Scheme for Real-Time Audio Compression
by: Mahfi, Muntahi Safwan, et al.
Published: (2025)
APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
by: Husain, Jaavid Aktar, et al.
Published: (2026)
Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning
by: Semenov, Andrei, et al.
Published: (2024)
BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps
by: Qian, Lekai, et al.
Published: (2026)
Masked Contrastive Pre-Training Improves Music Audio Key Detection
by: Yonay, Ori, et al.
Published: (2026)
Understanding the Algorithm Behind Audio Key Detection
by: Silva, Henrique Perez G.
Published: (2025)
Uncovering Population PK Covariates from VAE-Generated Latent Spaces
by: Perazzolo, Diego, et al.
Published: (2025)
VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio
by: Basha, Maris, et al.
Published: (2025)
Enhancing Document AI Data Generation Through Graph-Based Synthetic Layouts
by: Agarwal, Amit, et al.
Published: (2024)
PromptReverb: Multimodal Room Impulse Response Generation Through Latent Rectified Flow Matching
by: Vosoughi, Ali, et al.
Published: (2025)
Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model
by: Wang, Zihao, et al.
Published: (2025)
The Binding Effect: Analyzing How Multi-Dimensional Cues Form Gender Bias in Instruction TTS
by: Chen, Kuan-Yu, et al.
Published: (2026)
A Spatio-Temporal Deep Learning Approach For High-Resolution Gridded Monsoon Prediction
by: Borah, Parashjyoti, et al.
Published: (2026)
SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment
by: Mehta, Shivam, et al.
Published: (2025)
Contract-Driven QoE Auditing for Speech and Singing Services: From MOS Regression to Service Graphs
by: Du, Wenzhang
Published: (2025)
MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation
by: Zhu, Di, et al.
Published: (2026)
Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift
by: Chien, Sheng-You, et al.
Published: (2026)
Modeling L1 Influence on L2 Pronunciation: An MFCC-Based Framework for Explainable Machine Learning and Pedagogical Feedback
by: Jahanbin, Peyman
Published: (2025)
M6(GPT)3: Generating Multitrack Modifiable Multi-Minute MIDI Music from Text using Genetic algorithms, Probabilistic methods and GPT Models in any Progression and Time Signature
by: Poćwiardowski, Jakub, et al.
Published: (2024)
Benchmarking Sub-Genre Classification For Mainstage Dance Music
by: Shu, Hongzhi, et al.
Published: (2024)
Generation of Musical Timbres using a Text-Guided Diffusion Model
by: Yuan, Weixuan, et al.
Published: (2025)
Self-Improvement for Audio Large Language Model using Unlabeled Speech
by: Wang, Shaowen, et al.
Published: (2025)
MAIN-VC: Lightweight Speech Representation Disentanglement for One-shot Voice Conversion
by: Li, Pengcheng, et al.
Published: (2024)
MaskClip: Detachable Clip-on Piezoelectric Sensing of Mask Surface Vibrations for Real-time Noise-Robust Speech Input
by: Hiraki, Hirotaka, et al.
Published: (2025)
Audio Foundation Models Outperform Symbolic Representations for Piano Performance Evaluation
by: Dhiman, Jai
Published: (2026)
Named entity recognition for Serbian legal documents: Design, methodology and dataset development
by: Kalušev, Vladimir, et al.
Published: (2025)