:: Library Catalog

Bibliographic Details
Main Authors:	Muhammad, Imran; Schuller, Gerald
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2509.24769

Similar Items

Room Impulse Response Prediction with Neural Networks: From Energy Decay Curves to Perceptual Validation
by: Muhammad, Imran, et al.
Published: (2025)

3D Room Geometry Inference from Multichannel Room Impulse Response using Deep Neural Network
by: Yeon, Inmo, et al.
Published: (2024)

Deep Room Impulse Response Completion
by: Lin, Jackie, et al.
Published: (2024)

Multimodal Deep Learning Method for Real-Time Spatial Room Impulse Response Computing
by: Li, Zhiyu, et al.
Published: (2026)

AttentiveMOS: A Lightweight Attention-Only Model for Speech Quality Prediction
by: Kibria, Imran E, et al.
Published: (2024)

Predicting Global HRTFs From Scanned Head Geometry Using Deep Learning and Compact Representations
by: Wang, Yuxiang, et al.
Published: (2022)

Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers
by: Latif, Siddique, et al.
Published: (2023)

Curved Worlds, Clear Boundaries: Generalizing Speech Deepfake Detection using Hyperbolic and Spherical Geometry Spaces
by: Sheth, Farhan, et al.
Published: (2025)

Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning
by: Kashyap, Bipasha, et al.
Published: (2026)

EchoScan: Scanning Complex Room Geometries via Acoustic Echoes
by: Yeon, Inmo, et al.
Published: (2023)

Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2022)

Learning Filters in Feedback Delay Networks from Noisy Room Impulse Responses
by: Santo, Gloria Dal, et al.
Published: (2025)

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)

ParaCLAP -- Towards a general language-audio model for computational paralinguistic tasks
by: Jing, Xin, et al.
Published: (2024)

Room Impulse Responses help attackers to evade Deep Fake Detection
by: Luong, Hieu-Thi, et al.
Published: (2024)

Room Impulse Response Completion Using Signal-Prediction Diffusion Models Conditioned on Simulated Early Reflections
by: Xu, Zeyu, et al.
Published: (2026)

Charting 15 years of progress in deep learning for speech emotion recognition: A replication study
by: Triantafyllopoulos, Andreas, et al.
Published: (2025)

Computer Audition: From Task-Specific Machine Learning to Foundation Models
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

RGI-Net: 3D Room Geometry Inference from Room Impulse Responses With Hidden First-Order Reflections
by: Yeon, Inmo, et al.
Published: (2023)

A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era
by: Ren, Zhao, et al.
Published: (2023)

Low-Rank Adaptation of Deep Prior Neural Networks For Room Impulse Response Reconstruction
by: Pezzoli, Mirco, et al.
Published: (2025)

Audio-based Step-count Estimation for Running -- Windowing and Neural Network Baselines
by: Wagner, Philipp, et al.
Published: (2024)

An Adaptive Method for Target Curve Selection
by: Ravizza, Gabriele, et al.
Published: (2025)

Blind Identification of Binaural Room Impulse Responses from Smart Glasses
by: Deppisch, Thomas, et al.
Published: (2024)

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
by: Ding, Jiani, et al.
Published: (2025)

From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
by: Li, Yupei, et al.
Published: (2024)

Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation
by: Zhang, Zixing, et al.
Published: (2024)

Discovering and Causally Validating Emotion-Sensitive Neurons in Large Audio-Language Models
by: Zhao, Xiutian, et al.
Published: (2026)

Room compensation for loudspeaker reproduction using a supporting source
by: Brooks-Park, James, et al.
Published: (2026)

DARAS: Dynamic Audio-Room Acoustic Synthesis for Blind Room Impulse Response Estimation
by: Wang, Chunxi, et al.
Published: (2025)

Abusive Speech Detection in Indic Languages Using Acoustic Features
by: Spiesberger, Anika A., et al.
Published: (2024)

An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
by: Rampp, Simon, et al.
Published: (2024)

Adapting a Text-to-Audio Model for Room Impulse Response Generation
by: Kim, Kirak, et al.
Published: (2026)

Exploring the Power of Pure Attention Mechanisms in Blind Room Parameter Estimation
by: Wang, Chunxi, et al.
Published: (2024)

State-Space Estimation of Spatially Dynamic Room Impulse Responses using a Room Acoustic Model-based Prior
by: MacWilliam, Kathleen, et al.
Published: (2024)

Multiple Speaker Separation from Noisy Sources in Reverberant Rooms using Relative Transfer Matrix
by: Manamperi, Wageesha N., et al.
Published: (2025)

Explainable Detection of Machine Generated Music and Early Systematic Evaluation
by: Li, Yupei, et al.
Published: (2024)

AnyRIR: Robust Non-intrusive Room Impulse Response Estimation in the Wild
by: Lee, Kyung Yun, et al.
Published: (2025)

StreamMark: A Deep Learning-Based Semi-Fragile Audio Watermarking for Proactive Deepfake Detection
by: Liu, Zhentao, et al.
Published: (2026)