| Main Authors: | Guan, Jiazhi, Yang, Quanwei, Huang, Luying, Liang, Junhao, Liang, Borong, Feng, Haocheng, He, Wei, Wang, Kaisiyuan, Zhou, Hang, Wang, Jingdong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.09883 |
Similar Items
ONE-SHOT: Compositional Human-Environment Video Synthesis via Spatial-Decoupled Motion Injection and Hybrid Context Integration
by: Yang, Fengyuan, et al.
Published: (2026)
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
by: Sun, Yasheng, et al.
Published: (2025)
InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance
by: Pan, Dongwei, et al.
Published: (2026)
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
by: Guan, Jiazhi, et al.
Published: (2025)
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
by: Fan, Yingying, et al.
Published: (2025)
GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation
by: Yang, Quanwei, et al.
Published: (2025)
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
by: Guan, Jiazhi, et al.
Published: (2024)
TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model
by: Guan, Jiazhi, et al.
Published: (2024)
iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer
by: Shen, Zhelun, et al.
Published: (2025)
GenHOI: Towards Object-Consistent Hand-Object Interaction with Temporally Balanced and Spatially Selective Object Injection
by: Huang, Xuan, et al.
Published: (2026)
MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model
by: Tong, Jinguang, et al.
Published: (2026)
Unleashing Guidance Without Classifiers for Human-Object Interaction Animation
by: Wang, Ziyin, et al.
Published: (2026)
RefAlign: Representation Alignment for Reference-to-Video Generation
by: Wang, Lei, et al.
Published: (2026)
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing
by: Zhang, Xinyao, et al.
Published: (2026)
MUMMIES ON DISPLAY: CONSERVATION CONSIDERATIONS
by: Debra Meier
Published: (2001)
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
by: Li, Quanhao, et al.
Published: (2025)
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
by: Yang, Shiyuan, et al.
Published: (2024)
Efficient and Scalable Monocular Human-Object Interaction Motion Reconstruction
by: Wen, Boran, et al.
Published: (2025)
VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification
by: Zhang, Wanyue, et al.
Published: (2025)
TRACE: Object Motion Editing in Videos with First-Frame Trajectory Guidance
by: Phung, Quynh, et al.
Published: (2026)
Motion-aware Memory Network for Fast Video Salient Object Detection
by: Zhao, Xing, et al.
Published: (2022)
DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance
by: Wang, Cong, et al.
Published: (2023)
Point-to-Point: Sparse Motion Guidance for Controllable Video Editing
by: Song, Yeji, et al.
Published: (2025)
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
by: Sun, Yasheng, et al.
Published: (2024)
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
by: Zhang, Yuang, et al.
Published: (2024)
Accelerate Solving Expensive Scheduling by Leveraging Economical Auxiliary Tasks
by: Li, Minshuo, et al.
Published: (2024)
Momentum Auxiliary Network for Supervised Local Learning
by: Su, Junhao, et al.
Published: (2024)
Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion
by: Guzov, Vladimir, et al.
Published: (2022)
Object Concepts Emerge from Motion
by: Liang, Haoqian, et al.
Published: (2025)
GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos
by: Liu, Xinqi, et al.
Published: (2024)
FlowMotion: Training-Free Flow Guidance for Video Motion Transfer
by: Wang, Zhen, et al.
Published: (2026)
Point2Insert: Video Object Insertion via Sparse Point Guidance
by: Zhou, Yu, et al.
Published: (2026)
RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space
by: Liang, Jingyun, et al.
Published: (2025)
SERIAL DRAWING IN GIRLS WHO DISPLAY OPPOSITIONAL DEFIANT BEHAVIOR IN THE CLASSROOM
by: Andréia Mansk Boone Salles
Published: (2015)
Proactive Recommendation with Iterative Preference Guidance
by: Bi, Shuxian, et al.
Published: (2024)
Auxiliary Discrminator Sequence Generative Adversarial Networks (ADSeqGAN) for Few Sample Molecule Generation
by: Tang, Haocheng, et al.
Published: (2025)
InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
by: Xu, Sirui, et al.
Published: (2024)
CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos
by: Zhao, Chengfeng, et al.
Published: (2026)
iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance
by: Zheng, Jun, et al.
Published: (2026)
On the Robustness of Human-Object Interaction Detection against Distribution Shift
by: Xie, Chi, et al.
Published: (2025)