| Main Authors: | Dong, Zhongren, Zhang, Zixing, Xu, Weixiang, Han, Jing, Ou, Jianjun, Schuller, Björn W. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.03952 |
Similar Items
Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition
by: Zhang, Zixing, et al.
Published: (2024)
Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation
by: Zhang, Zixing, et al.
Published: (2024)
ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models
by: Zhang, Zixing, et al.
Published: (2024)
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis
by: He, Xiangheng, et al.
Published: (2024)
Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning
by: Kashyap, Bipasha, et al.
Published: (2026)
Abusive Speech Detection in Indic Languages Using Acoustic Features
by: Spiesberger, Anika A., et al.
Published: (2024)
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)
From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
by: Li, Yupei, et al.
Published: (2024)
Leveraging Multimodal Methods and Spontaneous Speech for Alzheimer's Disease Identification
by: Gao, Yifan, et al.
Published: (2024)
Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous Speech
by: Pu, Yu, et al.
Published: (2025)
ParaCLAP -- Towards a general language-audio model for computational paralinguistic tasks
by: Jing, Xin, et al.
Published: (2024)
Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers
by: Latif, Siddique, et al.
Published: (2023)
Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)
Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)
Charting 15 years of progress in deep learning for speech emotion recognition: A replication study
by: Triantafyllopoulos, Andreas, et al.
Published: (2025)
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
by: Qi, Tianhua, et al.
Published: (2026)
Explainable Detection of Machine Generated Music and Early Systematic Evaluation
by: Li, Yupei, et al.
Published: (2024)
Leveraging Local and Global Knowledge Integration with Time-Frequency Calibrated Distillation for Speech Enhancement
by: Cheng, Jiaming, et al.
Published: (2025)
Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2022)
Adaptive Speech Emotion Representation Learning Based On Dynamic Graph
by: Gao, Yingxue, et al.
Published: (2024)
S2ST-Omni: Hierarchical Language-Aware SpeechLLM Adaptation for Multilingual Speech-to-Speech Translation
by: Pan, Yu, et al.
Published: (2025)
STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition
by: Chang, Yi, et al.
Published: (2024)
M6: Multi-generator, Multi-domain, Multi-lingual and cultural, Multi-genres, Multi-instrument Machine-Generated Music Detection Databases
by: Li, Yupei, et al.
Published: (2024)
Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
by: Ding, Jiani, et al.
Published: (2025)
Representation Learning with Parameterised Quantum Circuits for Advancing Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2025)
Computer Audition: From Task-Specific Machine Learning to Foundation Models
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)
Enhancing Speech Emotion Recognition Through Differentiable Architecture Search
by: Rajapakshe, Thejan, et al.
Published: (2023)
Audio-based Step-count Estimation for Running -- Windowing and Neural Network Baselines
by: Wagner, Philipp, et al.
Published: (2024)
Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
by: Kounadis-Bastian, Dionyssos, et al.
Published: (2024)
Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition
by: Schrüfer, Oliver, et al.
Published: (2024)
Audio Explanation Synthesis with Generative Foundation Models
by: Akman, Alican, et al.
Published: (2024)
emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2024)
SmoothCLAP: Soft-Target Enhanced Contrastive Language-Audio Pretraining for Affective Computing
by: Jing, Xin, et al.
Published: (2026)
MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge
by: Jing, Xin, et al.
Published: (2025)
An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)
DOTA-ME-CS: Daily Oriented Text Audio-Mandarin English-Code Switching Dataset
by: Li, Yupei, et al.
Published: (2025)
Using voice analysis as an early indicator of risk for depression in young adults
by: Scherer, Klaus R., et al.
Published: (2024)
Speech as a Biomarker for Disease Detection
by: Botelho, Catarina, et al.
Published: (2024)
Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation
by: Lu, Cheng, et al.
Published: (2024)
Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation
by: Chang, Yi, et al.
Published: (2024)