:: Library Catalog

Bibliographic Details
Main Authors:	Qin, Yawen; Qiu, Ke; Zhang, Qin
Format:	Preprint
Published:	2026
Subjects:	Multimedia
Online Access:	https://arxiv.org/abs/2605.00824

Similar Items

CoheDancers: Enhancing Interactive Group Dance Generation through Music-Driven Coherence Decomposition
by: Yang, Kaixing, et al.
Published: (2024)

DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis
by: Wang, Zixuan, et al.
Published: (2024)

Dance2MIDI: Dance-driven multi-instruments music generation
by: Han, Bo, et al.
Published: (2023)

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance
by: Wang, Zixuan, et al.
Published: (2024)

DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions
by: Zhang, Hengyuan, et al.
Published: (2025)

InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions
by: Li, Ronghui, et al.
Published: (2024)

CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
by: Chen, Nan, et al.
Published: (2024)

DanceChat: Large Language Model-Guided Music-to-Dance Generation
by: Wang, Qing, et al.
Published: (2025)

DanceAnyWay: Synthesizing Beat-Guided 3D Dances with Randomized Temporal Contrastive Learning
by: Bhattacharya, Aneesh, et al.
Published: (2023)

MG-Former: A Transformer-Based Framework for Music-Driven 3D Conducting Gesture Generation
by: Qiu, Ke, et al.
Published: (2026)

MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation
by: Gupta, Prerit, et al.
Published: (2025)

Dance Any Beat: Blending Beats with Visuals in Dance Video Generation
by: Wang, Xuanchen, et al.
Published: (2024)

Dance-to-Music Generation with Encoder-based Textual Inversion
by: Li, Sifei, et al.
Published: (2024)

MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation
by: Yang, Kaixing, et al.
Published: (2025)

Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information
by: Huang, Qiaochu, et al.
Published: (2024)

Omni-Customizer: End-to-End MultiModal Customization for Joint Audio-Video Generation
by: Chen, Yuheng, et al.
Published: (2026)

Controllable Dance Generation with Style-Guided Motion Diffusion
by: Wang, Hongsong, et al.
Published: (2024)

Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling
by: Li, Xiaojie, et al.
Published: (2025)

Benchmarking Sub-Genre Classification For Mainstage Dance Music
by: Shu, Hongzhi, et al.
Published: (2024)

MATHDance: Mamba-Transformer Architecture with Uniform Tokenization for High-Quality 3D Dance Generation
by: Yang, Kaixing, et al.
Published: (2025)

LM2D: Lyrics- and Music-Driven Dance Synthesis
by: Yin, Wenjie, et al.
Published: (2024)

OmniCustom: Sync Audio-Video Customization Via Joint Audio-Video Generation Model
by: Li, Maomao, et al.
Published: (2026)

Tempo as the Stable Cue: Hierarchical Mixture of Tempo and Beat Experts for Music to 3D Dance Generation
by: Lyu, Guangtao, et al.
Published: (2025)

StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation
by: Wu, Yi, et al.
Published: (2025)

QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation
by: Zhou, Zhizhen, et al.
Published: (2024)

Anchorage: Visual Analysis of Satisfaction in Customer Service Videos via Anchor Events
by: Wong, Kam Kwai, et al.
Published: (2023)

GACA-DiT: Diffusion-based Dance-to-Music Generation with Genre-Adaptive Rhythm and Context-Aware Alignment
by: Wang, Jinting, et al.
Published: (2025)

User Digital Twin-Driven Video Streaming for Customized Preferences and Adaptive Transcoding
by: Jimmy, Stephen, et al.
Published: (2024)

ChoreoMuse: Robust Music-to-Dance Video Generation with Style Transfer and Beat-Adherent Motion
by: Wang, Xuanchen, et al.
Published: (2025)

RAG-VisualRec: An Open Resource for Vision- and Text-Enhanced Retrieval-Augmented Generation in Recommendation
by: Tourani, Ali, et al.
Published: (2025)

DREAM: A Dual Representation Learning Model for Multimodal Recommendation
by: Zhang, Kangning, et al.
Published: (2024)

Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization
by: Xu, Yu, et al.
Published: (2024)

PF-D2M: A Pose-free Diffusion Model for Universal Dance-to-Music Generation
by: Im, Jaekwon, et al.
Published: (2026)

RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training
by: Ding, Muhe, et al.
Published: (2024)

Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
by: Fan, Congyi, et al.
Published: (2025)

Towards Alleviating Text-to-Image Retrieval Hallucination for CLIP in Zero-shot Learning
by: Wang, Hanyao, et al.
Published: (2024)

Interactive Multi-Turn Retrieval for Health Videos
by: Wu, Chengzheng, et al.
Published: (2026)

Deep Reversible Consistency Learning for Cross-modal Retrieval
by: Pu, Ruitao, et al.
Published: (2025)

Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification
by: Qin, Yang, et al.
Published: (2025)

Adaptive Offloading and Enhancement for Low-Light Video Analytics on Mobile Devices
by: He, Yuanyi, et al.
Published: (2024)