| Main Author: | Chandra, Joydeep |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.03039 |
Similar Items
Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition
by: Tiwari, Upasana, et al.
Published: (2025)
Voice "Cloning" is Style Transfer
by: Zhou, Kaitlyn, et al.
Published: (2026)
Evaluating Human-AI Interaction via Usability, User Experience and Acceptance Measures for MMM-C: A Creative AI System for Music Composition
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)
What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark
by: Ibrahim, Adham, et al.
Published: (2024)
Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces
by: Chandra, Joydeep, et al.
Published: (2025)
Arabic Little STT: Arabic Children Speech Recognition Dataset
by: Alkadri, Mouhand, et al.
Published: (2025)
Morse Code-Enabled Speech Recognition for Individuals with Visual and Hearing Impairments
by: Choudhury, Ritabrata Roy
Published: (2024)
EmoAugNet: A Signal-Augmented Hybrid CNN-LSTM Framework for Speech Emotion Recognition
by: Paul, Durjoy Chandra, et al.
Published: (2025)
Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
by: Nakilcioglu, Emin Cagatay, et al.
Published: (2023)
Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech
by: Sinha, Abhijit, et al.
Published: (2025)
Composers' Evaluations of an AI Music Tool: Insights for Human-Centred Design
by: Row, Eleanor, et al.
Published: (2024)
Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation
by: Geng, Mengzhe, et al.
Published: (2024)
Music Generation using Human-In-The-Loop Reinforcement Learning
by: Justus, Aju Ani
Published: (2025)
Learning Relationships Between Separate Audio Tracks for Creative Applications
by: Bujard, Balthazar, et al.
Published: (2025)
Evaluating Co-Creativity using Total Information Flow
by: Gokul, Vignesh, et al.
Published: (2024)
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
by: Xie, Zhifei, et al.
Published: (2024)
Call2Instruct: Automated Pipeline for Generating Q&A Datasets from Call Center Recordings for LLM Fine-Tuning
by: Echeverria, Alex, et al.
Published: (2025)
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding
by: Liu, Tianyun
Published: (2025)
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese
by: Wang, Xihuai, et al.
Published: (2025)
EmoHeal: An End-to-End System for Personalized Therapeutic Music Retrieval from Fine-grained Emotions
by: Wan, Xinchen, et al.
Published: (2025)
Orchestrating Attention: Bringing Harmony to the 'Chaos' of Neurodivergent Learning States
by: Navneet, Satyam Kumar, et al.
Published: (2026)
An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution
by: Chen, Sheng-Kai, et al.
Published: (2026)
Same Words, Different Judgments: How Preferences Vary Across Modalities
by: Broukhim, Aaron, et al.
Published: (2026)
Echoes of Humanity: Exploring the Perceived Humanness of AI Music
by: Figueiredo, Flavio, et al.
Published: (2025)
BREATH: A Bio-Radar Embodied Agent for Tonal and Human-Aware Diffusion Music Generation
by: Wang, Yunzhe, et al.
Published: (2025)
Opening Musical Creativity? Embedded Ideologies in Generative-AI Music Systems
by: Pram, Liam, et al.
Published: (2025)
The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity
by: Bradshaw, Louis, et al.
Published: (2025)
Attribution-by-design: Ensuring Inference-Time Provenance in Generative Music Systems
by: Morreale, Fabio, et al.
Published: (2025)
Live Music Models
by: Lyria Team, et al.
Published: (2025)
Apollo: An Interactive Environment for Generating Symbolic Musical Phrases using Corpus-based Style Imitation
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)
Calliope: An Online Generative Music System for Symbolic Multi-Track Composition
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)
EEG-SSM: Leveraging State-Space Model for Dementia Detection
by: Tran, Xuan-The, et al.
Published: (2024)
A Cloud-Based Cross-Modal Transformer for Emotion Recognition and Adaptive Human-Computer Interaction
by: Zhong, Ziwen, et al.
Published: (2025)
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment
by: Chakravarty, Aditya
Published: (2024)
Enabling On-Device LLMs Personalization with Smartphone Sensing
by: Zhang, Shiquan, et al.
Published: (2024)
Proceedings of The third international workshop on eXplainable AI for the Arts (XAIxArts)
by: Ford, Corey, et al.
Published: (2025)
Predicting Trust In Autonomous Vehicles: Modeling Young Adult Psychosocial Traits, Risk-Benefit Attitudes, And Driving Factors With Machine Learning
by: Kaufman, Robert, et al.
Published: (2024)
Benchmarking Mobile Device Control Agents across Diverse Configurations
by: Lee, Juyong, et al.
Published: (2024)
Stress Detection on Code-Mixed Texts in Dravidian Languages using Machine Learning
by: Ramos, L., et al.
Published: (2024)
Do AI Voices Learn Social Nuances? A Case of Politeness and Speech Rate
by: Rabin, Eyal, et al.
Published: (2025)