:: Library Catalog

Bibliographic Details
Main Authors:	Singh, Vishwa Mohan; Aryasomayajula, Sai Anirudh; Chatterjee, Ahan; Aydemir, Beste; Amin, Rifat Mehreen
Format:	Preprint
Published:	2025
Subjects:	Sound; Human-Computer Interaction; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2506.04852

Similar Items

A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings
by: Han, Hyewon, et al.
Published: (2024)

Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment
by: Zhao, Zhixian, et al.
Published: (2024)

A Framework for AI assisted Musical Devices
by: Civit, Miguel, et al.
Published: (2024)

Using Confidence Scores to Improve Eyes-free Detection of Speech Recognition Errors
by: Nowrin, Sadia, et al.
Published: (2024)

Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
by: Mishra, Ruchik, et al.
Published: (2024)

Enhancing AAC Software for Dysarthric Speakers in e-Health Settings: An Evaluation Using TORGO
by: Hui, Macarious, et al.
Published: (2024)

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
by: Wang, Dingdong, et al.
Published: (2025)

Revisiting Your Memory: Reconstruction of Affect-Contextualized Memory via EEG-guided Audiovisual Generation
by: Kwon, Joonwoo, et al.
Published: (2024)

NeuroIncept Decoder for High-Fidelity Speech Reconstruction from Neural Activity
by: Khanday, Owais Mujtaba, et al.
Published: (2025)

Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings
by: Khanday, Owais Mujtaba, et al.
Published: (2025)

Sound2Hap: Learning Audio-to-Vibrotactile Haptic Generation from Human Ratings
by: Li, Yinan, et al.
Published: (2026)

A Mapping Strategy for Interacting with Latent Audio Synthesis Using Artistic Materials
by: Zheng, Shuoyang, et al.
Published: (2024)

Enhancing DMI Interactions by Integrating Haptic Feedback for Intricate Vibrato Technique
by: Piao, Ziyue, et al.
Published: (2024)

Towards Temporally Explainable Dysarthric Speech Clarity Assessment
by: Park, Seohyun, et al.
Published: (2025)

Interactive Sonification for Health and Energy using ChucK and Unity
by: Zhao, Yichun, et al.
Published: (2024)

A Near-Real-Time Processing Ego Speech Filtering Pipeline Designed for Speech Interruption During Human-Robot Interaction
by: Li, Yue, et al.
Published: (2024)

USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
by: Yu, Luca Jiang-Tao, et al.
Published: (2024)

Seeing Beyond Sound: Visualization and Abstraction in Audio Data Representation
by: Blumé, Ashlae
Published: (2025)

Early Detection of Furniture-Infesting Wood-Boring Beetles Using CNN-LSTM Networks and MFCC-Based Acoustic Features
by: Manukalpa, J. M. Chan Sri, et al.
Published: (2025)

Interfacing with history: Curating with audio augmented objects
by: Cliffe, Laurence
Published: (2024)

Transhuman Ansambl - Voice Beyond Language
by: Ivsic, Lucija, et al.
Published: (2024)

How Private is Low-Frequency Speech Audio in the Wild? An Analysis of Verbal Intelligibility by Humans and Machines
by: Liu, Ailin, et al.
Published: (2024)

Cervical Auscultation Machine Learning for Dysphagia Assessment
by: Chia, An An, et al.
Published: (2024)

ExSampling: a system for the real-time ensemble performance of field-recorded environmental sounds
by: Kobayashi, Atsuya, et al.
Published: (2020)

Adapting Whisper for Lightweight and Efficient Automatic Speech Recognition of Children for On-device Edge Applications
by: Dutta, Satwik, et al.
Published: (2025)

Real-time Generation of Various Types of Nodding for Avatar Attentive Listening System
by: Kato, Kazushi, et al.
Published: (2025)

BioSonix: Can Physics-Based Sonification Perceptualize Tissue Deformations From Tool Interactions?
by: Ruozzi, Veronica, et al.
Published: (2025)

Springboard, Roadblock or "Crutch"?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality
by: Povinelli, Kassie, et al.
Published: (2024)

Evolving Performance Practices in Beethoven's Cello Sonatas: Tempo, Portamento, and Historical Interpretation of the First Movements
by: Sole, Ignasi
Published: (2025)

SCDiar: a streaming diarization system based on speaker change detection and speech recognition
by: Zheng, Naijun, et al.
Published: (2025)

Teach Me How to ImproVISe: Co-Designing an Augmented Piano Training System for Improvisation
by: Deja, Jordan Aiko, et al.
Published: (2024)

Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)

The effect of self-motion and room familiarity on sound source localization in virtual environments
by: Isserstedt, Niklas, et al.
Published: (2024)

NeckCare: Preventing Tech Neck using Hearable-based Multimodal Sensing
by: Chhaglani, Bhawana, et al.
Published: (2024)

Hidden bawls, whispers, and yelps: can text be made to sound more than just its words?
by: Pataca, Caluã de Lacerda, et al.
Published: (2022)

Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge
by: Liu, Shuiyun, et al.
Published: (2024)

SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding
by: Wang, Hongbin, et al.
Published: (2025)

Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition
by: Chen, Youjun, et al.
Published: (2025)

Beyond-Voice: Towards Continuous 3D Hand Pose Tracking on Commercial Home Assistant Devices
by: Li, Yin, et al.
Published: (2023)

WSCoach: Wearable Real-time Auditory Feedback for Reducing Unwanted Words in Daily Communication
by: Youpeng, Zhang, et al.
Published: (2025)