Library Catalog

Bibliographic Details
Main Authors:	Qi, Zhangyang; Yang, Yunhan; Zhang, Mengchen; Xing, Long; Wu, Xiaoyang; Wu, Tong; Lin, Dahua; Liu, Xihui; Wang, Jiaqi; Zhao, Hengshuang
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.06191

Similar Items

GPT4Point: A Unified Framework for Point-Language Understanding and Generation
by: Qi, Zhangyang, et al.
Published: (2023)

DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
by: Yang, Yunhan, et al.
Published: (2023)

DreamComposer++: Empowering Diffusion Models with Multi-View Conditions for 3D Content Generation
by: Yang, Yunhan, et al.
Published: (2025)

SS4D: Native 4D Generative Model via Structured Spacetime Latents
by: Li, Zhibing, et al.
Published: (2025)

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
by: Qi, Zhangyang, et al.
Published: (2025)

3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models
by: Zhang, Yuhan, et al.
Published: (2025)

Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
by: Zhang, Mengchen, et al.
Published: (2024)

LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
by: Yang, Shuai, et al.
Published: (2024)

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
by: Wu, Xiaoyang, et al.
Published: (2023)

ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation
by: Zhang, Mengchen, et al.
Published: (2025)

SAMPart3D: Segment Any Part in 3D Objects
by: Yang, Yunhan, et al.
Published: (2024)

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
by: Huang, Zhening, et al.
Published: (2023)

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
by: Li, Zhibing, et al.
Published: (2024)

PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World
by: Yang, Yunhan, et al.
Published: (2026)

LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans
by: Huang, Zhening, et al.
Published: (2025)

VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning
by: Qi, Zhangyang, et al.
Published: (2025)

Point Transformer V3: Simpler, Faster, Stronger
by: Wu, Xiaoyang, et al.
Published: (2023)

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
by: Zhang, Mengchen, et al.
Published: (2025)

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
by: Zhang, Yujia, et al.
Published: (2025)

HY3D-Bench: Generation of 3D Assets
by: Hunyuan3D, Team, et al.
Published: (2026)

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
by: Chen, Zhaoxi, et al.
Published: (2024)

Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity
by: Zhang, Yuhan, et al.
Published: (2025)

Velocity-Space 3D Asset Editing
by: Liu, Hao, et al.
Published: (2026)

FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models
by: Wu, Tong, et al.
Published: (2024)

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
by: Peng, Bohao, et al.
Published: (2024)

GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
by: Wang, Chengyao, et al.
Published: (2024)

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
by: Fang, Ye, et al.
Published: (2024)

Articraft: An Agentic System for Scalable Articulated 3D Asset Generation
by: Zhou, Matt, et al.
Published: (2026)

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
by: Liu, Xian, et al.
Published: (2023)

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data
by: Sun, Zeyi, et al.
Published: (2024)

SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images
by: Dong, Jiahua, et al.
Published: (2024)

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation
by: Lin, Jiantao, et al.
Published: (2025)

MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
by: Huang, Zehuan, et al.
Published: (2024)

Generating Human-AI Collaborative Design Sequence for 3D Assets via Differentiable Operation Graph
by: Huang, Xiaoyang, et al.
Published: (2025)

HoloPart: Generative 3D Part Amodal Segmentation
by: Yang, Yunhan, et al.
Published: (2025)

RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation
by: Huang, Hanzhuo, et al.
Published: (2026)

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
by: Zhu, Haoyi, et al.
Published: (2023)

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting
by: Zhang, Zheng, et al.
Published: (2024)

Utonia: Toward One Encoder for All Point Clouds
by: Zhang, Yujia, et al.
Published: (2026)

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
by: Wu, Tong, et al.
Published: (2024)