:: Library Catalog

Bibliographic Details
Main Authors:	Vercammen, Charlotte; Heinrich, Antje; Lesimple, Christophe; Paglialonga, Alessia; Wasmann, Jan-Willem A.; Buhl, Mareike
Format:	Preprint
Published:	2025
Subjects:	Sound; Audio and Speech Processing; Medical Physics
Online Access:	https://arxiv.org/abs/2505.04728

Similar Items

Discrimination loss vs. SRT: A model-based approach towards harmonizing speech test interpretations
by: Buhl, Mareike, et al.
Published: (2025)

Integrating audiological datasets via federated merging of Auditory Profiles
by: Saak, Samira, et al.
Published: (2024)

The UmboMic: A PVDF Cantilever Microphone
by: Yeiser, Aaron J., et al.
Published: (2023)

An Implantable Piezofilm Middle Ear Microphone: Performance in Human Cadaveric Temporal Bones
by: Zhang, John Z., et al.
Published: (2023)

On the relevance of acoustic measurements for creating realistic virtual acoustic environments
by: Gündert, Siegfried, et al.
Published: (2023)

Standardized Evaluation of Fetal Phonocardiography Processing Methods
by: Müller, Kristóf, et al.
Published: (2025)

Quantization-Based Score Calibration for Few-Shot Keyword Spotting with Dynamic Time Warping in Noisy Environments
by: Wilkinghoff, Kevin, et al.
Published: (2025)

Neural Speech Tracking in a Virtual Acoustic Environment: Audio-Visual Benefit for Unscripted Continuous Speech
by: Daeglau, Mareike, et al.
Published: (2025)

SALT: Standardized Audio event Label Taxonomy
by: Stamatiadis, Paraskevas, et al.
Published: (2024)

Timbre Perception, Representation, and its Neuroscientific Exploration: A Comprehensive Review
by: Zhang, Hong, et al.
Published: (2024)

Diff-MST: Differentiable Mixing Style Transfer
by: Vanka, Soumya Sai, et al.
Published: (2024)

DAC-JAX: A JAX Implementation of the Descript Audio Codec
by: Braun, David
Published: (2024)

Multi-Sample Dynamic Time Warping for Few-Shot Keyword Spotting
by: Wilkinghoff, Kevin, et al.
Published: (2024)

IQRA 2026: Interspeech Challenge on Automatic Pronunciation Assessment for Modern Standard Arabic (MSA)
by: Kheir, Yassine El, et al.
Published: (2026)

Audio Generation Through Score-Based Generative Modeling: Design Principles and Implementation
by: Zhu, Ge, et al.
Published: (2025)

Implementation and Applications of WakeWords Integrated with Speaker Recognition: A Case Study
by: Filho, Alexandre Costa Ferro, et al.
Published: (2024)

Diff-MSTC: A Mixing Style Transfer Prototype for Cubase
by: Vanka, Soumya Sai, et al.
Published: (2024)

Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach
by: Abeßer, Jakob, et al.
Published: (2025)

An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS
by: Kunešová, Marie, et al.
Published: (2025)

Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech
by: Yang, Dong, et al.
Published: (2024)

Automatic Music Mixing using a Generative Model of Effect Embeddings
by: Moliner, Eloi, et al.
Published: (2025)

A Data-Driven Exploration of Elevation Cues in HRTFs: An Explainable AI Perspective Across Multiple Datasets
by: De Rus, Juan Antonio, et al.
Published: (2025)

MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling
by: Cheng, Yifan, et al.
Published: (2025)

Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)

Scaling up masked audio encoder learning for general audio classification
by: Dinkel, Heinrich, et al.
Published: (2024)

Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders
by: Sun, Xingwei, et al.
Published: (2025)

SPO-CLAPScore: Enhancing CLAP-based alignment prediction system with Standardize Preference Optimization, for the first XACLE Challenge
by: Takano, Taisei, et al.
Published: (2026)

Sound Field Translation and Mixed Source Model for Virtual Applications with Perceptual Validation
by: Birnie, Lachlan, et al.
Published: (2020)

E2E-AEC: Implementing an end-to-end neural network learning approach for acoustic echo cancellation
by: Jiang, Yiheng, et al.
Published: (2026)

X-ARES: A Comprehensive Framework for Assessing Audio Encoder Performance
by: Zhang, Junbo, et al.
Published: (2025)

Listening to Multi-talker Conversations: Modular and End-to-end Perspectives
by: Raj, Desh
Published: (2024)

A New Perspective on Speaker Verification: Joint Modeling with DFSMN and Transformer
by: Wang, Hongyu, et al.
Published: (2023)

Frequency-Domain Sound Field from the Perspective of Band-Limited Functions
by: Iwami, Takahiro, et al.
Published: (2024)

AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive Mixup
by: Carvalho, Carlos, et al.
Published: (2024)

Examining the Interplay Between Privacy and Fairness for Speech Processing: A Review and Perspective
by: Leschanowsky, Anna, et al.
Published: (2024)

Speech Recognition for Analysis of Police Radio Communication
by: Srivastava, Tejes, et al.
Published: (2024)

Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
by: Wu, Shih-Lun, et al.
Published: (2023)

AudioEval: Automatic Dual-Perspective and Multi-Dimensional Evaluation of Text-to-Audio-Generation
by: Wang, Hui, et al.
Published: (2025)

Rethinking Speech Representation Aggregation in Speech Enhancement: A Phonetic Mutual Information Perspective
by: Han, Seungu, et al.
Published: (2026)

Communication conditions in virtual acoustic scenes in an underground station
by: Hládek, Ľuboš, et al.
Published: (2021)