| Main Authors: | Singh, Vishwa Mohan, Aryasomayajula, Sai Anirudh, Chatterjee, Ahan, Aydemir, Beste, Amin, Rifat Mehreen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.04852 |
Similar Items
A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings
by: Han, Hyewon, et al.
Published: (2024)
Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment
by: Zhao, Zhixian, et al.
Published: (2024)
A Framework for AI assisted Musical Devices
by: Civit, Miguel, et al.
Published: (2024)
Using Confidence Scores to Improve Eyes-free Detection of Speech Recognition Errors
by: Nowrin, Sadia, et al.
Published: (2024)
Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
by: Mishra, Ruchik, et al.
Published: (2024)
Enhancing AAC Software for Dysarthric Speakers in e-Health Settings: An Evaluation Using TORGO
by: Hui, Macarious, et al.
Published: (2024)
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
by: Wang, Dingdong, et al.
Published: (2025)
Revisiting Your Memory: Reconstruction of Affect-Contextualized Memory via EEG-guided Audiovisual Generation
by: Kwon, Joonwoo, et al.
Published: (2024)
NeuroIncept Decoder for High-Fidelity Speech Reconstruction from Neural Activity
by: Khanday, Owais Mujtaba, et al.
Published: (2025)
Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings
by: Khanday, Owais Mujtaba, et al.
Published: (2025)
Sound2Hap: Learning Audio-to-Vibrotactile Haptic Generation from Human Ratings
by: Li, Yinan, et al.
Published: (2026)
A Mapping Strategy for Interacting with Latent Audio Synthesis Using Artistic Materials
by: Zheng, Shuoyang, et al.
Published: (2024)
Enhancing DMI Interactions by Integrating Haptic Feedback for Intricate Vibrato Technique
by: Piao, Ziyue, et al.
Published: (2024)
Towards Temporally Explainable Dysarthric Speech Clarity Assessment
by: Park, Seohyun, et al.
Published: (2025)
Interactive Sonification for Health and Energy using ChucK and Unity
by: Zhao, Yichun, et al.
Published: (2024)
A Near-Real-Time Processing Ego Speech Filtering Pipeline Designed for Speech Interruption During Human-Robot Interaction
by: Li, Yue, et al.
Published: (2024)
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
by: Yu, Luca Jiang-Tao, et al.
Published: (2024)
Seeing Beyond Sound: Visualization and Abstraction in Audio Data Representation
by: Blum'e, Ashlae
Published: (2025)
Early Detection of Furniture-Infesting Wood-Boring Beetles Using CNN-LSTM Networks and MFCC-Based Acoustic Features
by: Manukalpa, J. M. Chan Sri, et al.
Published: (2025)
Interfacing with history: Curating with audio augmented objects
by: Cliffe, Laurence
Published: (2024)
Transhuman Ansambl - Voice Beyond Language
by: Ivsic, Lucija, et al.
Published: (2024)
How Private is Low-Frequency Speech Audio in the Wild? An Analysis of Verbal Intelligibility by Humans and Machines
by: Liu, Ailin, et al.
Published: (2024)
Cervical Auscultation Machine Learning for Dysphagia Assessment
by: Chia, An An, et al.
Published: (2024)
ExSampling: a system for the real-time ensemble performance of field-recorded environmental sounds
by: Kobayashi, Atsuya, et al.
Published: (2020)
Adapting Whisper for Lightweight and Efficient Automatic Speech Recognition of Children for On-device Edge Applications
by: Dutta, Satwik, et al.
Published: (2025)
Real-time Generation of Various Types of Nodding for Avatar Attentive Listening System
by: Kato, Kazushi, et al.
Published: (2025)
BioSonix: Can Physics-Based Sonification Perceptualize Tissue Deformations From Tool Interactions?
by: Ruozzi, Veronica, et al.
Published: (2025)
Springboard, Roadblock or "Crutch"?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality
by: Povinelli, Kassie, et al.
Published: (2024)
Evolving Performance Practices in Beethoven's Cello Sonatas: Tempo, Portamento, and Historical Interpretation of the First Movements
by: Sole, Ignasi
Published: (2025)
SCDiar: a streaming diarization system based on speaker change detection and speech recognition
by: Zheng, Naijun, et al.
Published: (2025)
Teach Me How to ImproVISe: Co-Designing an Augmented Piano Training System for Improvisation
by: Deja, Jordan Aiko, et al.
Published: (2024)
Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)
The effect of self-motion and room familiarity on sound source localization in virtual environments
by: Isserstedt, Niklas, et al.
Published: (2024)
NeckCare: Preventing Tech Neck using Hearable-based Multimodal Sensing
by: Chhaglani, Bhawana, et al.
Published: (2024)
Hidden bawls, whispers, and yelps: can text be made to sound more than just its words?
by: Pataca, Caluã de Lacerda, et al.
Published: (2022)
Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge
by: Liu, Shuiyun, et al.
Published: (2024)
SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding
by: Wang, Hongbin, et al.
Published: (2025)
Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition
by: Chen, Youjun, et al.
Published: (2025)
Beyond-Voice: Towards Continuous 3D Hand Pose Tracking on Commercial Home Assistant Devices
by: Li, Yin, et al.
Published: (2023)
WSCoach: Wearable Real-time Auditory Feedback for Reducing Unwanted Words in Daily Communication
by: Youpeng, Zhang, et al.
Published: (2025)