Staff View: POPDG: Popular 3D Dance Generation with PopDanceSet :: Library Catalog

Bibliographic Details
Main Authors:	Luo, Zhenye; Ren, Min; Hu, Xuecai; Huang, Yongzhen; Yao, Li
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2405.03178

_version_	1866913625923387392
author	Luo, Zhenye; Ren, Min; Hu, Xuecai; Huang, Yongzhen; Yao, Li
author_facet	Luo, Zhenye; Ren, Min; Hu, Xuecai; Huang, Yongzhen; Yao, Li
contents	Generating dances that are both lifelike and well-aligned with music remains a challenging task in the cross-modal domain. This paper introduces PopDanceSet, the first dataset tailored to the preferences of young audiences, enabling the generation of aesthetically oriented dances. It also surpasses the AIST++ dataset in music genre diversity and in the intricacy and depth of dance movements. The proposed POPDG model, built within the iDDPM framework, enhances dance diversity and, through the Space Augmentation Algorithm, strengthens spatial physical connections between human body joints, ensuring that increased diversity does not compromise generation quality. A streamlined Alignment Module is also designed to improve the temporal alignment between dance and music. Extensive experiments show that POPDG achieves SOTA results on two datasets. The paper also expands on current evaluation metrics. The dataset and code are available at https://github.com/Luke-Luo1/POPDG.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_03178
institution	arXiv
publishDate	2024
record_format	arxiv
title	POPDG: Popular 3D Dance Generation with PopDanceSet
topic	Sound; Audio and Speech Processing
url	https://arxiv.org/abs/2405.03178

Similar Items