:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Sirui; Wang, Ziyin; Wang, Yu-Xiong; Gui, Liang-Yan
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.19652

Similar Items

InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
by: Xu, Sirui, et al.
Published: (2025)

InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions
by: Xu, Sirui, et al.
Published: (2025)

Unleashing Guidance Without Classifiers for Human-Object Interaction Animation
by: Wang, Ziyin, et al.
Published: (2026)

InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions
by: Xu, Sirui, et al.
Published: (2026)

ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation
by: Zhou, Yuan, et al.
Published: (2025)

Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors
by: Lou, Yuke, et al.
Published: (2025)

PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
by: Zhang, Tianyuan, et al.
Published: (2024)

HandX: Scaling Bimanual Motion and Interaction Generation
by: Zhang, Zimu, et al.
Published: (2026)

TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation
by: Liu, Yufei, et al.
Published: (2024)

MoReact: Generating Reactive Motion from Textual Descriptions
by: Xu, Xiyan, et al.
Published: (2025)

InterFusion: Text-Driven Generation of 3D Human-Object Interaction
by: Dai, Sisi, et al.
Published: (2024)

3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
by: Zhang, Frank, et al.
Published: (2024)

HASSOD: Hierarchical Adaptive Self-Supervised Object Detection
by: Cao, Shengcao, et al.
Published: (2024)

Situational Awareness Matters in 3D Vision Language Reasoning
by: Man, Yunze, et al.
Published: (2024)

ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)

InteractMove: Text-Controlled Human-Object Interaction Generation in 3D Scenes with Movable Objects
by: Cai, Xinhao, et al.
Published: (2025)

GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects
by: Li, Shujia, et al.
Published: (2025)

ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing
by: Chen, Jun-Kun, et al.
Published: (2024)

StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard with Spatial-temporal Dynamics
by: Li, Bingliang, et al.
Published: (2026)

DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception
by: Man, Yunze, et al.
Published: (2023)

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
by: Xiong, Lexiang, et al.
Published: (2025)

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation
by: Ma, Zhiyuan, et al.
Published: (2024)

HandDreamer: Zero-Shot Text to 3D Hand Model Generation using Corrective Hand Shape Guidance
by: Rosh, Green, et al.
Published: (2026)

Interact-Custom: Customized Human Object Interaction Image Generation
by: Xu, Zhu, et al.
Published: (2025)

Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
by: Cao, Shengcao, et al.
Published: (2024)

GraphicsDreamer: Image to 3D Generation with Physical Consistency
by: Chen, Pei, et al.
Published: (2024)

Text-guided Zero-Shot Object Localization
by: Wang, Jingjing, et al.
Published: (2024)

InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
by: Hoe, Jiun Tian, et al.
Published: (2025)

Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models
by: Yu, Lu, et al.
Published: (2024)

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
by: Zhu, He, et al.
Published: (2025)

LAGO: Language-Guided Adaptive Object-Region Focus for Zero-Shot Visual-Text Alignment
by: Hu, Junyi, et al.
Published: (2026)

Locality-Aware Zero-Shot Human-Object Interaction Detection
by: Kim, Sanghyun, et al.
Published: (2025)

A Review of Human-Object Interaction Detection
by: Wang, Yuxiao, et al.
Published: (2024)

PlacidDreamer: Advancing Harmony in Text-to-3D Generation
by: Huang, Shuo, et al.
Published: (2024)

SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
by: Zantout, Nader, et al.
Published: (2025)

Articulate3D: Zero-Shot Text-Driven 3D Object Posing
by: Deb, Oishi, et al.
Published: (2025)

Zero-Shot Temporal Interaction Localization for Egocentric Videos
by: Zhang, Erhang, et al.
Published: (2025)

BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis
by: Jiang, Lutao, et al.
Published: (2024)

Hybrid Discriminative Attribute-Object Embedding Network for Compositional Zero-Shot Learning
by: Liu, Yang, et al.
Published: (2024)

Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding
by: Wang, Haibo, et al.
Published: (2026)