Staff View: Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech :: Library Catalog

Bibliographic Details
Main Authors:	Homma, Takeshi; Sun, Qinghua; Fujioka, Takuya; Takawaki, Ryuta; Ankyu, Eriko; Nagamatsu, Kenji; Sugawara, Daichi; Harada, Etsuko T.
Format:	Preprint
Published:	2021
Subjects:	Robotics; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2109.12787

_version_	1866929559286317056
author	Homma, Takeshi; Sun, Qinghua; Fujioka, Takuya; Takawaki, Ryuta; Ankyu, Eriko; Nagamatsu, Kenji; Sugawara, Daichi; Harada, Etsuko T.
author_facet	Homma, Takeshi; Sun, Qinghua; Fujioka, Takuya; Takawaki, Ryuta; Ankyu, Eriko; Nagamatsu, Kenji; Sugawara, Daichi; Harada, Etsuko T.
contents	When people try to influence others to do something, they subconsciously adjust their speech to include appropriate emotional information. For a robot to influence people in the same way, it should be able to imitate the range of human emotions when speaking. To achieve this, we propose a speech synthesis method for imitating the emotional states in human speech. In contrast to previous methods, our method requires less manual effort to adjust the emotion of the synthesized speech. Our synthesizer receives an emotion vector that characterizes the emotion of the synthesized speech. The vector is obtained automatically from human utterances by a speech emotion recognizer. We evaluated our method in a scenario in which a robot tries to regulate an elderly person's circadian rhythm by speaking to the person using appropriate emotional states. As the target speech to imitate, we collected utterances from professional caregivers as they spoke to elderly people at different times of the day. We then conducted a subjective evaluation in which elderly participants listened to speech samples generated by our method. The results showed that listening to the samples made the participants feel more active in the early morning and calmer in the middle of the night. This suggests that the robot may be able to adjust the participants' circadian rhythm and can potentially exert influence in a way similar to a person.
format	Preprint
id	arxiv_https___arxiv_org_abs_2109_12787
institution	arXiv
publishDate	2021
record_format	arxiv
title	Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech
topic	Robotics; Audio and Speech Processing
url	https://arxiv.org/abs/2109.12787

Similar Items