| Main Authors: | Wu, Yuheng; Gao, Xiangbo; Chen, Tianhao; Chen, Xinghao; Yin, Qing; Tu, Zhengzhong; Lee, Dongman |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.14382 |
Similar Items
TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing
by: Lionar, Stefan, et al.
Published: (2025)
Cert-LAS: Toward Certified Model Ownership Verification for Text-to-Image Diffusion Models via Layer-Adaptive Smoothing
by: Qi, Leyi, et al.
Published: (2026)
Unison: Harmonizing Motion, Speech, and Sound for Human-Centric Audio-Video Generation
by: Cheng, Shihao, et al.
Published: (2026)
Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation
by: He, Liu, et al.
Published: (2024)
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
by: Lin, Jiantao, et al.
Published: (2025)
VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness
by: Cha, SeungJu, et al.
Published: (2025)
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
by: Guan, Jiazhi, et al.
Published: (2025)
Casual3DHDR: Deblurring High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos
by: Gong, Shucheng, et al.
Published: (2025)
InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
by: Hoe, Jiun Tian, et al.
Published: (2023)
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
by: Hoe, Jiun Tian, et al.
Published: (2025)
STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting
by: Chai, Zenghao, et al.
Published: (2024)
Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era
by: Hu, Xiaowei, et al.
Published: (2024)
DreamCinema: Cinematic Transfer with Free Camera and 3D Character
by: Chen, Weiliang, et al.
Published: (2024)
Representing Long Volumetric Video with Temporal Gaussian Hierarchy
by: Xu, Zhen, et al.
Published: (2024)
FairyGen: Storied Cartoon Video from a Single Child-Drawn Character
by: Zheng, Jiayi, et al.
Published: (2025)
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
by: Girdhar, Rohit, et al.
Published: (2023)
Neural Network-Based Tracking and 3D Reconstruction of Baseball Pitch Trajectories from Single-View 2D Video
by: Hsieh, Jhen
Published: (2024)
EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers
by: Flynn, John, et al.
Published: (2026)
MusicScore: A Dataset for Music Score Modeling and Generation
by: Lin, Yuheng, et al.
Published: (2024)
Sound Sparks Motion: Audio and Text Tuning for Video Editing
by: Razlighi, AmirHossein Naghi, et al.
Published: (2026)
ImagenHub: Standardizing the evaluation of conditional image generation models
by: Ku, Max, et al.
Published: (2023)
HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation
by: Dong, Wenqi, et al.
Published: (2025)
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
by: Guan, Jiazhi, et al.
Published: (2024)
PersonaGest: Personalized Co-Speech Gesture Generation with Semantic-Guided Hierarchical Motion Representation
by: Zhao, Junchuan, et al.
Published: (2026)
Laplacian Analysis Meets Dynamics Modelling: Gaussian Splatting for 4D Reconstruction
by: Zhou, Yifan, et al.
Published: (2025)
SVGS: Enhancing Gaussian Splatting Using Primitives with Spatially Varying Colors
by: Xu, Rui, et al.
Published: (2024)
DrawVideo: Generating Long Video from Storyboard Keyframe Sketches
by: Xu, Chuanzhi, et al.
Published: (2026)
Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization
by: Xu, Yu, et al.
Published: (2024)
MesonGS++: Post-training Compression of 3D Gaussian Splatting with Hyperparameter Searching
by: Xie, Shuzhao, et al.
Published: (2026)
SIG-Chat: Spatial Intent-Guided Conversational Gesture Generation Involving How, When and Where
by: Huang, Yiheng, et al.
Published: (2025)
ChoreoMuse: Robust Music-to-Dance Video Generation with Style Transfer and Beat-Adherent Motion
by: Wang, Xuanchen, et al.
Published: (2025)
MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation
by: Gupta, Prerit, et al.
Published: (2025)
A Survey on 3D Gaussian Splatting
by: Chen, Guikun, et al.
Published: (2024)
DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions
by: Zhang, Hengyuan, et al.
Published: (2025)
SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation
by: Pham, Kien T., et al.
Published: (2025)
Improving Generative Adversarial Network Generalization for Facial Expression Synthesis
by: Akram, Arbish, et al.
Published: (2026)
Perceive-Sample-Compress: Towards Real-Time 3D Gaussian Splatting
by: Wang, Zijian, et al.
Published: (2025)
Splatography: Sparse multi-view dynamic Gaussian Splatting for filmmaking challenges
by: Azzarelli, Adrian, et al.
Published: (2025)
altiro3D: Scene representation from single image and novel view synthesis
by: Canessa, E., et al.
Published: (2023)
Exploring Palette based Color Guidance in Diffusion Models
by: Qiu, Qianru, et al.
Published: (2025)