Library Catalog

Bibliographic Details
Main Author:	Chandra, Joydeep
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence; Human-Computer Interaction; Sound
Online Access:	https://arxiv.org/abs/2605.03039

Similar Items

Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition
by: Tiwari, Upasana, et al.
Published: (2025)

Voice "Cloning" is Style Transfer
by: Zhou, Kaitlyn, et al.
Published: (2026)

Evaluating Human-AI Interaction via Usability, User Experience and Acceptance Measures for MMM-C: A Creative AI System for Music Composition
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)

What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark
by: Ibrahim, Adham, et al.
Published: (2024)

Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces
by: Chandra, Joydeep, et al.
Published: (2025)

Arabic Little STT: Arabic Children Speech Recognition Dataset
by: Alkadri, Mouhand, et al.
Published: (2025)

Morse Code-Enabled Speech Recognition for Individuals with Visual and Hearing Impairments
by: Choudhury, Ritabrata Roy
Published: (2024)

EmoAugNet: A Signal-Augmented Hybrid CNN-LSTM Framework for Speech Emotion Recognition
by: Paul, Durjoy Chandra, et al.
Published: (2025)

Adaptation and Optimization of Automatic Speech Recognition (ASR) for the Maritime Domain in the Field of VHF Communication
by: Nakilcioglu, Emin Cagatay, et al.
Published: (2023)

Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech
by: Sinha, Abhijit, et al.
Published: (2025)

Composers' Evaluations of an AI Music Tool: Insights for Human-Centred Design
by: Row, Eleanor, et al.
Published: (2024)

Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation
by: Geng, Mengzhe, et al.
Published: (2024)

Music Generation using Human-In-The-Loop Reinforcement Learning
by: Justus, Aju Ani
Published: (2025)

Learning Relationships Between Separate Audio Tracks for Creative Applications
by: Bujard, Balthazar, et al.
Published: (2025)

Evaluating Co-Creativity using Total Information Flow
by: Gokul, Vignesh, et al.
Published: (2024)

Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
by: Xie, Zhifei, et al.
Published: (2024)

Call2Instruct: Automated Pipeline for Generating Q&A Datasets from Call Center Recordings for LLM Fine-Tuning
by: Echeverria, Alex, et al.
Published: (2025)

Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding
by: Liu, Tianyun
Published: (2025)

Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese
by: Wang, Xihuai, et al.
Published: (2025)

EmoHeal: An End-to-End System for Personalized Therapeutic Music Retrieval from Fine-grained Emotions
by: Wan, Xinchen, et al.
Published: (2025)

Orchestrating Attention: Bringing Harmony to the 'Chaos' of Neurodivergent Learning States
by: Navneet, Satyam Kumar, et al.
Published: (2026)

An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution
by: Chen, Sheng-Kai, et al.
Published: (2026)

Same Words, Different Judgments: How Preferences Vary Across Modalities
by: Broukhim, Aaron, et al.
Published: (2026)

Echoes of Humanity: Exploring the Perceived Humanness of AI Music
by: Figueiredo, Flavio, et al.
Published: (2025)

BREATH: A Bio-Radar Embodied Agent for Tonal and Human-Aware Diffusion Music Generation
by: Wang, Yunzhe, et al.
Published: (2025)

Opening Musical Creativity? Embedded Ideologies in Generative-AI Music Systems
by: Pram, Liam, et al.
Published: (2025)

The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity
by: Bradshaw, Louis, et al.
Published: (2025)

Attribution-by-design: Ensuring Inference-Time Provenance in Generative Music Systems
by: Morreale, Fabio, et al.
Published: (2025)

Live Music Models
by: Lyria Team, et al.
Published: (2025)

Apollo: An Interactive Environment for Generating Symbolic Musical Phrases using Corpus-based Style Imitation
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)

Calliope: An Online Generative Music System for Symbolic Multi-Track Composition
by: Tchemeube, Renaud Bougueng, et al.
Published: (2025)

EEG-SSM: Leveraging State-Space Model for Dementia Detection
by: Tran, Xuan-The, et al.
Published: (2024)

A Cloud-Based Cross-Modal Transformer for Emotion Recognition and Adaptive Human-Computer Interaction
by: Zhong, Ziwen, et al.
Published: (2025)

Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment
by: Chakravarty, Aditya
Published: (2024)

Enabling On-Device LLMs Personalization with Smartphone Sensing
by: Zhang, Shiquan, et al.
Published: (2024)

Proceedings of The third international workshop on eXplainable AI for the Arts (XAIxArts)
by: Ford, Corey, et al.
Published: (2025)

Predicting Trust In Autonomous Vehicles: Modeling Young Adult Psychosocial Traits, Risk-Benefit Attitudes, And Driving Factors With Machine Learning
by: Kaufman, Robert, et al.
Published: (2024)

Benchmarking Mobile Device Control Agents across Diverse Configurations
by: Lee, Juyong, et al.
Published: (2024)

Stress Detection on Code-Mixed Texts in Dravidian Languages using Machine Learning
by: Ramos, L., et al.
Published: (2024)

Do AI Voices Learn Social Nuances? A Case of Politeness and Speech Rate
by: Rabin, Eyal, et al.
Published: (2025)