:: Library Catalog

Bibliographic Details
Main Authors:	Triantafyllopoulos, Andreas, Šťastný, Jakub, Terpinas, Alexios, Liu, Tianyi, Wang, Yuanqi, Schuller, Björn W.
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2605.19984

Similar Items

Charting 15 years of progress in deep learning for speech emotion recognition: A replication study
by: Triantafyllopoulos, Andreas, et al.
Published: (2025)

ParaCLAP -- Towards a general language-audio model for computational paralinguistic tasks
by: Jing, Xin, et al.
Published: (2024)

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)

Audio-based Step-count Estimation for Running -- Windowing and Neural Network Baselines
by: Wagner, Philipp, et al.
Published: (2024)

Abusive Speech Detection in Indic Languages Using Acoustic Features
by: Spiesberger, Anika A., et al.
Published: (2024)

autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
by: Rampp, Simon, et al.
Published: (2024)

EmoSURA: Towards Accurate Evaluation of Detailed and Long-Context Emotional Speech Captions
by: Jing, Xin, et al.
Published: (2026)

Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance
by: Milling, Manuel, et al.
Published: (2024)

MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge
by: Jing, Xin, et al.
Published: (2025)

An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

Computer Audition: From Task-Specific Machine Learning to Foundation Models
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

Exploring Meta Information for Audio-based Zero-shot Bird Classification
by: Gebhard, Alexander, et al.
Published: (2023)

SmoothCLAP: Soft-Target Enhanced Contrastive Language-Audio Pretraining for Affective Computing
by: Jing, Xin, et al.
Published: (2026)

Bringing the Discussion of Minima Sharpness to the Audio Domain: a Filter-Normalised Evaluation for Acoustic Scene Classification
by: Milling, Manuel, et al.
Published: (2023)

DFingerNet: Noise-Adaptive Speech Enhancement for Hearing Aids
by: Tsangko, Iosif, et al.
Published: (2025)

Detecting COPD Through Speech Analysis: A Dataset of Danish Speech and Machine Learning Approach
by: Sankey-Olsen, Cuno, et al.
Published: (2025)

Expressivity and Speech Synthesis
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

Enrolment-based personalisation for improving individual-level fairness in speech emotion recognition
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)

Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning
by: Kashyap, Bipasha, et al.
Published: (2026)

Audio Explanation Synthesis with Generative Foundation Models
by: Akman, Alican, et al.
Published: (2024)

The Affective Bridge: Preserving Speech Representations while Enhancing Deepfake Detection via Emotional Constraints
by: Li, Yupei, et al.
Published: (2025)

From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
by: Li, Yupei, et al.
Published: (2024)

Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation
by: Zhang, Zixing, et al.
Published: (2024)

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
by: Ding, Jiani, et al.
Published: (2025)

Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level Dropin & Neuroplasticity Mechanisms
by: Li, Yupei, et al.
Published: (2026)

M6: Multi-generator, Multi-domain, Multi-lingual and cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases
by: Li, Yupei, et al.
Published: (2024)

Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers
by: Latif, Siddique, et al.
Published: (2023)

DFALLM: Achieving Generalizable Multitask Deepfake Detection by Optimizing Audio LLM Components
by: Li, Yupei, et al.
Published: (2025)

ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis
by: He, Xiangheng, et al.
Published: (2024)

Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2022)

Explainable Detection of Machine Generated Music and Early Systematic Evaluation
by: Li, Yupei, et al.
Published: (2024)

AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
by: Qi, Tianhua, et al.
Published: (2026)

Representation Learning with Parameterised Quantum Circuits for Advancing Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2025)

Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
by: Haque, Kazi Nazmul, et al.
Published: (2024)

DOTA-ME-CS: Daily Oriented Text Audio-Mandarin English-Code Switching Dataset
by: Li, Yupei, et al.
Published: (2025)

Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)

Using voice analysis as an early indicator of risk for depression in young adults
by: Scherer, Klaus R., et al.
Published: (2024)

Are you really listening? Boosting Perceptual Awareness in Music-QA Benchmarks
by: Zang, Yongyi, et al.
Published: (2025)

Non-Verbal Vocalisations and their Challenges: Emotion, Privacy, Sparseness, and Real Life
by: Batliner, Anton, et al.
Published: (2025)