:: Library Catalog

Bibliographic Details
Main authors:	Kotowski, Błażej; Evans, Nicholas; Haki, Behzad; Font, Frederic; Jordà, Sergi
Format:	Preprint
Published:	2025
Subjects:	Human-Computer Interaction; Artificial Intelligence; Sound; Audio and Speech Processing
Online access:	https://arxiv.org/abs/2509.05145

Similar Items

The language of sound search: Examining User Queries in Audio Search Engines
by: Weck, Benno, et al.
Published: (2024)

Real-time Generation of Various Types of Nodding for Avatar Attentive Listening System
by: Kato, Kazushi, et al.
Published: (2025)

SoundShift: Exploring Sound Manipulations for Accessible Mixed-Reality Awareness
by: Chang, Ruei-Che, et al.
Published: (2024)

USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
by: Yu, Luca Jiang-Tao, et al.
Published: (2024)

Advancing User-Voice Interaction: Exploring Emotion-Aware Voice Assistants Through a Role-Swapping Approach
by: Ma, Yong, et al.
Published: (2025)

Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)

FeatureSense: Protecting Speaker Attributes in Always-On Audio Sensing System
by: Chhaglani, Bhawana, et al.
Published: (2025)

Teach Me How to ImproVISe: Co-Designing an Augmented Piano Training System for Improvisation
by: Deja, Jordan Aiko, et al.
Published: (2024)

Sound2Hap: Learning Audio-to-Vibrotactile Haptic Generation from Human Ratings
by: Li, Yinan, et al.
Published: (2026)

Lla-VAP: LSTM Ensemble of Llama and VAP for Turn-Taking Prediction
by: Jeon, Hyunbae, et al.
Published: (2024)

Capturing Cancer as Music: Cancer Mechanisms Expressed through Musification
by: Hnatyshyn, Rostyslav, et al.
Published: (2024)

Interactive Melody Generation System for Enhancing the Creativity of Musicians
by: Hirawata, So, et al.
Published: (2024)

From Qubits to Rhythm: Exploring Quantum Random Walks in Rhythmspaces
by: Aguado-Yáñez, María, et al.
Published: (2025)

NeuroIncept Decoder for High-Fidelity Speech Reconstruction from Neural Activity
by: Khanday, Owais Mujtaba, et al.
Published: (2025)

Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings
by: Khanday, Owais Mujtaba, et al.
Published: (2025)

A Mapping Strategy for Interacting with Latent Audio Synthesis Using Artistic Materials
by: Zheng, Shuoyang, et al.
Published: (2024)

Enhancing DMI Interactions by Integrating Haptic Feedback for Intricate Vibrato Technique
by: Piao, Ziyue, et al.
Published: (2024)

Towards Temporally Explainable Dysarthric Speech Clarity Assessment
by: Park, Seohyun, et al.
Published: (2025)

A cross-talk robust multichannel VAD model for multiparty agent interactions trained using synthetic re-recordings
by: Han, Hyewon, et al.
Published: (2024)

Interactive Sonification for Health and Energy using ChucK and Unity
by: Zhao, Yichun, et al.
Published: (2024)

A Near-Real-Time Processing Ego Speech Filtering Pipeline Designed for Speech Interruption During Human-Robot Interaction
by: Li, Yue, et al.
Published: (2024)

Seeing Beyond Sound: Visualization and Abstraction in Audio Data Representation
by: Blumé, Ashlae
Published: (2025)

Early Detection of Furniture-Infesting Wood-Boring Beetles Using CNN-LSTM Networks and MFCC-Based Acoustic Features
by: Manukalpa, J. M. Chan Sri, et al.
Published: (2025)

Interfacing with history: Curating with audio augmented objects
by: Cliffe, Laurence
Published: (2024)

Transhuman Ansambl - Voice Beyond Language
by: Ivsic, Lucija, et al.
Published: (2024)

How Private is Low-Frequency Speech Audio in the Wild? An Analysis of Verbal Intelligibility by Humans and Machines
by: Liu, Ailin, et al.
Published: (2024)

Cervical Auscultation Machine Learning for Dysphagia Assessment
by: Chia, An An, et al.
Published: (2024)

ExSampling: a system for the real-time ensemble performance of field-recorded environmental sounds
by: Kobayashi, Atsuya, et al.
Published: (2020)

Adapting Whisper for Lightweight and Efficient Automatic Speech Recognition of Children for On-device Edge Applications
by: Dutta, Satwik, et al.
Published: (2025)

BioSonix: Can Physics-Based Sonification Perceptualize Tissue Deformations From Tool Interactions?
by: Ruozzi, Veronica, et al.
Published: (2025)

Springboard, Roadblock or "Crutch"?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality
by: Povinelli, Kassie, et al.
Published: (2024)

Evolving Performance Practices in Beethoven's Cello Sonatas: Tempo, Portamento, and Historical Interpretation of the First Movements
by: Sole, Ignasi
Published: (2025)

SCDiar: a streaming diarization system based on speaker change detection and speech recognition
by: Zheng, Naijun, et al.
Published: (2025)

The effect of self-motion and room familiarity on sound source localization in virtual environments
by: Isserstedt, Niklas, et al.
Published: (2024)

NeckCare: Preventing Tech Neck using Hearable-based Multimodal Sensing
by: Chhaglani, Bhawana, et al.
Published: (2024)

Hidden bawls, whispers, and yelps: can text be made to sound more than just its words?
by: Pataca, Caluã de Lacerda, et al.
Published: (2022)

Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge
by: Liu, Shuiyun, et al.
Published: (2024)

SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding
by: Wang, Hongbin, et al.
Published: (2025)

Towards LLM-Empowered Fine-Grained Speech Descriptors for Explainable Emotion Recognition
by: Chen, Youjun, et al.
Published: (2025)

Beyond-Voice: Towards Continuous 3D Hand Pose Tracking on Commercial Home Assistant Devices
by: Li, Yin, et al.
Published: (2023)