:: Library Catalog

Bibliographic Details
Main Authors:	Du, Bi'an; Liu, Daizong; Li, Pufan; Hu, Wei
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.21557

Similar Items

Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing
by: Du, Bi'an, et al.
Published: (2024)

Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image
by: Li, Pufan, et al.
Published: (2025)

HierOctFusion: Multi-scale Octree-based 3D Shape Generation via Part-Whole-Hierarchy Message Passing
by: Gao, Xinjie, et al.
Published: (2025)

View-Consistent 3D Scene Editing via Dual-Path Structural Correspondence and Semantic Continuity
by: Li, Pufan, et al.
Published: (2026)

Multi-scale Latent Point Consistency Models for 3D Shape Generation
by: Du, Bi'an, et al.
Published: (2024)

Fast3D: Accelerating 3D Multi-modal Large Language Models for Efficient 3D Scene Understanding
by: Huang, Wencan, et al.
Published: (2025)

AugGS: Self-augmented Gaussians with Structural Masks for Sparse-view 3D Reconstruction
by: Du, Bi'an, et al.
Published: (2024)

Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding
by: Liu, Yang, et al.
Published: (2024)

Improving the Transferability of 3D Point Cloud Attack via Spectral-aware Admix and Optimization Designs
by: Hu, Shiyu, et al.
Published: (2024)

A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
by: Liu, Daizong, et al.
Published: (2024)

AdaCo: Overcoming Visual Foundation Model Noise in 3D Semantic Segmentation via Adaptive Label Correction
by: Zou, Pufan, et al.
Published: (2024)

Hard-Label Black-Box Attacks on 3D Point Clouds
by: Liu, Daizong, et al.
Published: (2024)

PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis
by: Jia, Jinrang, et al.
Published: (2026)

A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends
by: Liu, Daizong, et al.
Published: (2024)

Chart Specification: Structural Representations for Incentivizing VLM Reasoning in Chart-to-Code Generation
by: He, Minggui, et al.
Published: (2026)

Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision
by: Taher, Mohammad Reza Hosseinzadeh, et al.
Published: (2024)

Recursive Neural Programs: Variational Learning of Image Grammars and Part-Whole Hierarchies
by: Fisher, Ares, et al.
Published: (2022)

UniPart: Part-Level 3D Generation with Unified 3D Geom-Seg Latents
by: He, Xufan, et al.
Published: (2025)

OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
by: Yang, Yunhan, et al.
Published: (2025)

Behave Your Motion: Habit-preserved Cross-category Animal Motion Transfer
by: Zhang, Zhimin, et al.
Published: (2025)

SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts
by: Zhao, Shijia, et al.
Published: (2025)

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
by: Huang, Zehuan, et al.
Published: (2024)

SegviGen: Repurposing 3D Generative Model for Part Segmentation
by: Li, Lin, et al.
Published: (2026)

OneWorld: Taming Scene Generation with 3D Unified Representation Autoencoder
by: Gao, Sensen, et al.
Published: (2026)

From 2D to 3D Cognition: A Brief Survey of General World Models
by: Xie, Ningwei, et al.
Published: (2025)

An Image Is Worth Ten Thousand Words: Verbose-Text Induction Attacks on VLMs
by: Luo, Zhi, et al.
Published: (2025)

PAFUSE: Part-based Diffusion for 3D Whole-Body Pose Estimation
by: Samet, Nermin, et al.
Published: (2024)

HoloPart: Generative 3D Part Amodal Segmentation
by: Yang, Yunhan, et al.
Published: (2025)

Rethinking Video-Language Model from the Language Input Perspective
by: Fang, Xiang, et al.
Published: (2026)

From One to More: Contextual Part Latents for 3D Generation
by: Dong, Shaocong, et al.
Published: (2025)

UrbanWorld: An Urban World Model for 3D City Generation
by: Shang, Yu, et al.
Published: (2024)

CoSMo3D: Open-World Promptable 3D Semantic Part Segmentation through LLM-Guided Canonical Spatial Modeling
by: Jin, Li, et al.
Published: (2026)

HOLODECK 2.0: Vision-Language-Guided 3D World Generation with Editing
by: Bian, Zixuan, et al.
Published: (2025)

HumanOrbit: 3D Human Reconstruction as 360° Orbit Generation
by: Suzuki, Keito, et al.
Published: (2026)

MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls
by: Bian, Yuxuan, et al.
Published: (2024)

Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
by: Liu, Yi, et al.
Published: (2024)

OmniMotion-X: Versatile Multimodal Whole-Body Motion Generation
by: Xu, Guowei, et al.
Published: (2025)

Turbo3D: Ultra-fast Text-to-3D Generation
by: Hu, Hanzhe, et al.
Published: (2024)

Hierarchy-Aware Fine-Tuning of Vision-Language Models
by: Li, Jiayu, et al.
Published: (2025)

WorldGrow: Generating Infinite 3D World
by: Li, Sikuang, et al.
Published: (2025)