:: Library Catalog

Bibliographic Details
Main Authors:	Homma, Takeshi, Sun, Qinghua, Fujioka, Takuya, Takawaki, Ryuta, Ankyu, Eriko, Nagamatsu, Kenji, Sugawara, Daichi, Harada, Etsuko T.
Format:	Preprint
Published:	2021
Subjects:	Robotics; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2109.12787

Similar Items

EmoSSLSphere: Multilingual Emotional Speech Synthesis with Spherical Vectors and Discrete Speech Tokens
by: Park, Joonyong, et al.
Published: (2025)

Hierarchical Control of Emotion Rendering in Speech Synthesis
by: Inoue, Sho, et al.
Published: (2024)

Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2024)

AS-Speech: Adaptive Style For Speech Synthesis
by: Li, Zhipeng, et al.
Published: (2024)

ASR for Affective Speech: Investigating Impact of Emotion and Speech Generative Strategy
by: Wu, Ya-Tse, et al.
Published: (2026)

EME-TTS: Unlocking the Emphasis and Emotion Link in Speech Synthesis
by: Li, Haoxun, et al.
Published: (2025)

AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
by: Qi, Tianhua, et al.
Published: (2026)

KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis
by: Abilbekov, Adal, et al.
Published: (2024)

Textless and Non-Parallel Speech-to-Speech Emotion Style Transfer
by: Dutta, Soumya, et al.
Published: (2025)

Speech Quality-Based Localization of Low-Quality Speech and Text-to-Speech Synthesis Artefacts
by: Kuhlmann, Michael, et al.
Published: (2026)

RSET: Remapping-based Sorting Method for Emotion Transfer Speech Synthesis
by: Shi, Haoxiang, et al.
Published: (2024)

Reasoning Beyond Majority Vote: An Explainable SpeechLM Framework for Speech Emotion Recognition
by: Su, Bo-Hao, et al.
Published: (2025)

ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis
by: Tang, Haobin, et al.
Published: (2024)

Emotion-Coherent Speech Data Augmentation and Self-Supervised Contrastive Style Training for Enhancing Kids's Story Speech Synthesis
by: Chung, Raymond
Published: (2026)

HYFuse: Aligning Heterogeneous Speech Pre-Trained Representations in Hyperbolic Space for Speech Emotion Recognition
by: Phukan, Orchid Chetia, et al.
Published: (2025)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
by: Wang, Tianrui, et al.
Published: (2025)

Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2025)

MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
by: Yang, Qian, et al.
Published: (2024)

Multi-Scale Temporal Transformer For Speech Emotion Recognition
by: Li, Zhipeng, et al.
Published: (2024)

Leveraging Content and Acoustic Representations for Speech Emotion Recognition
by: Dutta, Soumya, et al.
Published: (2024)

Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model
by: Ihori, Mana, et al.
Published: (2025)

Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions
by: Mi, Jinyi, et al.
Published: (2024)

Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)

Interleaved Speech-Text Language Models for Simple Streaming Text-to-Speech Synthesis
by: Yang, Yifan, et al.
Published: (2024)

ART: The Alternating Reading Task Corpus for Speech Entrainment and Imitation
by: Yuan, Zheng, et al.
Published: (2024)

FUSE: Universal Speech Enhancement using Multi-Stage Fusion of Sparse Compression and Token Generation Models for the URGENT 2025 Challenge
by: Goswami, Nabarun, et al.
Published: (2025)

How Attention Shapes Emotion: A Comparative Study of Attention Mechanisms for Speech Emotion Recognition
by: Casals-Salvador, Marc, et al.
Published: (2026)

Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions
by: Chou, Huang-Cheng, et al.
Published: (2025)

Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
by: Zhang, Chu Yuan, et al.
Published: (2023)

Speech to Speech Synthesis for Voice Impersonation
by: Johnson, Bjorn, et al.
Published: (2026)

Fine-Grained Quantitative Emotion Editing for Speech Generation
by: Inoue, Sho, et al.
Published: (2024)

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
by: Wu, Haibin, et al.
Published: (2024)

Dataset-Distillation Generative Model for Speech Emotion Recognition
by: Ritter-Gutierrez, Fabian, et al.
Published: (2024)

THAI Speech Emotion Recognition (THAI-SER) corpus
by: Wongpithayadisai, Jilamika, et al.
Published: (2025)

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
by: Sun, Haoqin, et al.
Published: (2024)

EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model
by: Yang, Yiqing, et al.
Published: (2025)

Enhancing In-the-Wild Speech Emotion Conversion with Resynthesis-based Duration Modeling
by: Prabhu, Navin Raj, et al.
Published: (2025)

Speech Emotion Recognition Via CNN-Transformer and Multidimensional Attention Mechanism
by: Tang, Xiaoyu, et al.
Published: (2024)

Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
by: Zhou, Enting, et al.
Published: (2023)