| Main Authors: | Homma, Takeshi, Sun, Qinghua, Fujioka, Takuya, Takawaki, Ryuta, Ankyu, Eriko, Nagamatsu, Kenji, Sugawara, Daichi, Harada, Etsuko T. |
|---|---|
| Format: | Preprint |
| Published: | 2021 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2109.12787 |
Similar Items
EmoSSLSphere: Multilingual Emotional Speech Synthesis with Spherical Vectors and Discrete Speech Tokens
by: Park, Joonyong, et al.
Published: (2025)
Hierarchical Control of Emotion Rendering in Speech Synthesis
by: Inoue, Sho, et al.
Published: (2024)
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2024)
AS-Speech: Adaptive Style For Speech Synthesis
by: Li, Zhipeng, et al.
Published: (2024)
ASR for Affective Speech: Investigating Impact of Emotion and Speech Generative Strategy
by: Wu, Ya-Tse, et al.
Published: (2026)
EME-TTS: Unlocking the Emphasis and Emotion Link in Speech Synthesis
by: Li, Haoxun, et al.
Published: (2025)
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
by: Qi, Tianhua, et al.
Published: (2026)
KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis
by: Abilbekov, Adal, et al.
Published: (2024)
Textless and Non-Parallel Speech-to-Speech Emotion Style Transfer
by: Dutta, Soumya, et al.
Published: (2025)
Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)
RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis
by: Shi, Haoxiang, et al.
Published: (2024)
Reasoning Beyond Majority Vote: An Explainable SpeechLM Framework for Speech Emotion Recognition
by: Su, Bo-Hao, et al.
Published: (2025)
ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis
by: Tang, Haobin, et al.
Published: (2024)
Emotion-Coherent Speech Data Augmentation and Self-Supervised Contrastive Style Training for Enhancing Kids's Story Speech Synthesis
by: Chung, Raymond
Published: (2026)
HYFuse: Aligning Heterogeneous Speech Pre-Trained Representations in Hyperbolic Space for Speech Emotion Recognition
by: Phukan, Orchid Chetia, et al.
Published: (2025)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
by: Wang, Tianrui, et al.
Published: (2025)
Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2025)
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
by: Yang, Qian, et al.
Published: (2024)
Multi-Scale Temporal Transformer For Speech Emotion Recognition
by: Li, Zhipeng, et al.
Published: (2024)
Leveraging Content and Acoustic Representations for Speech Emotion Recognition
by: Dutta, Soumya, et al.
Published: (2024)
Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model
by: Ihori, Mana, et al.
Published: (2025)
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions
by: Mi, Jinyi, et al.
Published: (2024)
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)
Interleaved Speech-Text Language Models for Simple Streaming Text-to-Speech Synthesis
by: Yang, Yifan, et al.
Published: (2024)
ART: The Alternating Reading Task Corpus for Speech Entrainment and Imitation
by: Yuan, Zheng, et al.
Published: (2024)
FUSE: Universal Speech Enhancement using Multi-Stage Fusion of Sparse Compression and Token Generation Models for the URGENT 2025 Challenge
by: Goswami, Nabarun, et al.
Published: (2025)
How Attention Shapes Emotion: A Comparative Study of Attention Mechanisms for Speech Emotion Recognition
by: Casals-Salvador, Marc, et al.
Published: (2026)
Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions
by: Chou, Huang-Cheng, et al.
Published: (2025)
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
by: Zhang, Chu Yuan, et al.
Published: (2023)
Speech to Speech Synthesis for Voice Impersonation
by: Johnson, Bjorn, et al.
Published: (2026)
Fine-Grained Quantitative Emotion Editing for Speech Generation
by: Inoue, Sho, et al.
Published: (2024)
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
by: Wu, Haibin, et al.
Published: (2024)
Dataset-Distillation Generative Model for Speech Emotion Recognition
by: Ritter-Gutierrez, Fabian, et al.
Published: (2024)
THAI Speech Emotion Recognition (THAI-SER) corpus
by: Wongpithayadisai, Jilamika, et al.
Published: (2025)
Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
by: Sun, Haoqin, et al.
Published: (2024)
EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model
by: Yang, Yiqing, et al.
Published: (2025)
Enhancing In-the-Wild Speech Emotion Conversion with Resynthesis-based Duration Modeling
by: Prabhu, Navin Raj, et al.
Published: (2025)
Speech Emotion Recognition Via CNN-Transformer and Multidimensional Attention Mechanism
by: Tang, Xiaoyu, et al.
Published: (2024)
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
by: Zhou, Enting, et al.
Published: (2023)