| Main authors: | Hallmen, Tobias; Deuser, Fabian; Oswald, Norbert; André, Elisabeth |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2403.11879 |
Similar items
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
by: Cho, Deok-Hyeon, et al.
Published: (2024)
Learning Frame-Wise Emotion Intensity for Audio-Driven Talking-Head Generation
by: Xu, Jingyi, et al.
Published: (2024)
Facial Expression-Enhanced TTS: Combining Face Representation and Emotion Intensity for Adaptive Speech
by: Chu, Yunji, et al.
Published: (2024)
MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention
by: Jiao, Xinxin, et al.
Published: (2024)
Audio-Guided Fusion Techniques for Multimodal Emotion Analysis
by: Shi, Pujin, et al.
Published: (2024)
Efficient Feature Extraction and Late Fusion Strategy for Audiovisual Emotional Mimicry Intensity Estimation
by: Yu, Jun, et al.
Published: (2024)
Wearable Music2Emotion : Assessing Emotions Induced by AI-Generated Music through Portable EEG-fNIRS Fusion
by: Zhao, Sha, et al.
Published: (2025)
Active Learning with Task Adaptation Pre-training for Speech Emotion Recognition
by: Li, Dongyuan, et al.
Published: (2024)
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion
by: Gudmalwar, Ashishkumar, et al.
Published: (2024)
Disentangling Reasoning in Large Audio-Language Models for Ambiguous Emotion Prediction
by: Yu, Xiaofeng, et al.
Published: (2026)
Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS
by: Wang, Haoyu, et al.
Published: (2024)
Explaining Deep Learning Embeddings for Speech Emotion Recognition by Predicting Interpretable Acoustic Features
by: Dixit, Satvik, et al.
Published: (2024)
ML-SAN: Multi-Level Speaker-Adaptive Network for Emotion Recognition in Conversations
by: Wang, Kexue, et al.
Published: (2026)
Color-based Emotion Representation for Speech Emotion Recognition
by: Nagase, Ryotaro, et al.
Published: (2026)
Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
by: Kang, Jaeyong, et al.
Published: (2024)
MultiVerse: Efficient and Expressive Zero-Shot Multi-Task Text-to-Speech
by: Bak, Taejun, et al.
Published: (2024)
MPE-TTS: Customized Emotion Zero-Shot Text-To-Speech Using Multi-Modal Prompt
by: Wu, Zhichao, et al.
Published: (2025)
Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention
by: Wang, Cong, et al.
Published: (2025)
Emotional Text-To-Speech Based on Mutual-Information-Guided Emotion-Timbre Disentanglement
by: Yang, Jianing, et al.
Published: (2025)
Sync-TVA: A Graph-Attention Framework for Multimodal Emotion Recognition with Cross-Modal Fusion
by: Deng, Zeyu, et al.
Published: (2025)
GMP-TL: Gender-augmented Multi-scale Pseudo-label Enhanced Transfer Learning for Speech Emotion Recognition
by: Pan, Yu, et al.
Published: (2024)
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
by: Wang, He, et al.
Published: (2024)
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
by: Cho, Deok-Hyeon, et al.
Published: (2024)
EffiFusion-GAN: Efficient Fusion Generative Adversarial Network for Speech Enhancement
by: Wen, Bin, et al.
Published: (2025)
Beyond Discrete Categories: Multi-Task Valence-Arousal Modeling for Pet Vocalization Analysis
by: Huang, Junyao, et al.
Published: (2025)
Joint Learning of Emotions in Music and Generalized Sounds
by: Simonetta, Federico, et al.
Published: (2024)
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech
by: Cho, Deok-Hyeon, et al.
Published: (2025)
SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training
by: Yu, Jiaxing, et al.
Published: (2024)
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion
by: Chen, Shunian, et al.
Published: (2025)
Persian Speech Emotion Recognition by Fine-Tuning Transformers
by: Shayaninasab, Minoo, et al.
Published: (2024)
Leveraging Label Potential for Enhanced Multimodal Emotion Recognition
by: Shao, Xuechun, et al.
Published: (2025)
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection
by: Mu, Da, et al.
Published: (2024)
Semi-Supervised Self-Learning Enhanced Music Emotion Recognition
by: Sun, Yifu, et al.
Published: (2024)
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
by: Sampath, Aneesha, et al.
Published: (2025)
Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition
by: Guo, Chengling, et al.
Published: (2026)
Abstract Sound Fusion with Unconditional Inversion Models
by: Liu, Jing, et al.
Published: (2025)
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding
by: Nguyen, Tan Dat, et al.
Published: (2024)
Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation
by: Chang, Yi, et al.
Published: (2024)
Emotion-Driven Melody Harmonization via Melodic Variation and Functional Representation
by: Huang, Jingyue, et al.
Published: (2024)
Learning Physiology-Informed Vocal Spectrotemporal Representations for Speech Emotion Recognition
by: Zhang, Xu, et al.
Published: (2026)