:: Library Catalog

Bibliographic Details
Main Authors:	Dong, Jiahua, Wu, Tong, Qian, Rui, Wang, Jiaqi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2412.05274

Similar Items

3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
by: Dong, Jiahua, et al.
Published: (2026)

SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing
by: Qian, Qi, et al.
Published: (2024)

ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields
by: Dong, Jiahua, et al.
Published: (2024)

A Survey on RGB, 3D, and Multimodal Approaches for Unsupervised Industrial Image Anomaly Detection
by: Lin, Yuxuan, et al.
Published: (2024)

Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
by: Qi, Zhangyang, et al.
Published: (2024)

Non-Invasive 3D Wound Measurement with RGB-D Imaging
by: Harkämper, Lena, et al.
Published: (2026)

LinK3D: Linear Keypoints Representation for 3D LiDAR Point Cloud
by: Cui, Yunge, et al.
Published: (2022)

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images
by: Wu, Yushuang, et al.
Published: (2024)

Optimizing 4D Wires for Sparse 3D Abstraction
by: Wu, Dong-Yi, et al.
Published: (2026)

3D-MVP: 3D Multiview Pretraining for Robotic Manipulation
by: Qian, Shengyi, et al.
Published: (2024)

SCas4D: Structural Cascaded Optimization for Boosting Persistent 4D Novel View Synthesis
by: Lyu, Jipeng, et al.
Published: (2025)

SEED: A Simple and Effective 3D DETR in Point Clouds
by: Liu, Zhe, et al.
Published: (2024)

Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding
by: Qian, Guocheng, et al.
Published: (2022)

R3DPA: Leveraging 3D Representation Alignment and RGB Pretrained Priors for LiDAR Scene Generation
by: Sereyjol-Garros, Nicolas, et al.
Published: (2026)

SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction
by: Skorokhodov, Vsevolod, et al.
Published: (2026)

GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data
by: Li, Haoyuan, et al.
Published: (2024)

Sim-to-Real Grasp Detection with Global-to-Local RGB-D Adaptation
by: Ma, Haoxiang, et al.
Published: (2024)

RGB2Point: 3D Point Cloud Generation from Single RGB Images
by: Lee, Jae Joong, et al.
Published: (2024)

SimToken: A Simple Baseline for Referring Audio-Visual Segmentation
by: Jin, Dian, et al.
Published: (2025)

SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations
by: Wandel, Krispin, et al.
Published: (2025)

3D Instance Segmentation Using Deep Learning on RGB-D Indoor Data
by: Yasir, Siddiqui Muhammad, et al.
Published: (2024)

Gesture-Aware Pretraining and Token Fusion for 3D Hand Pose Estimation
by: Hong, Rui, et al.
Published: (2026)

3D-LSPTM: An Automatic Framework with 3D-Large-Scale Pretrained Model for Laryngeal Cancer Detection Using Laryngoscopic Videos
by: Qiu, Meiyu, et al.
Published: (2024)

A Modular Pipeline for 3D Object Tracking Using RGB Cameras
by: Bredereke, Lars, et al.
Published: (2025)

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
by: Gao, Yipeng, et al.
Published: (2023)

Learning to Recover Spectral Reflectance from RGB Images
by: Huo, Dong, et al.
Published: (2023)

GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving
by: Xu, Shaoqing, et al.
Published: (2024)

SegFly: A 2D-3D-2D Paradigm for Aerial RGB-Thermal Semantic Segmentation at Scale
by: Gross, Markus, et al.
Published: (2026)

TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image
by: Wang, Haoxiao, et al.
Published: (2025)

BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
by: Chu, Tao, et al.
Published: (2023)

3D-UGCN: A Unified Graph Convolutional Network for Robust 3D Human Pose Estimation from Monocular RGB Images
by: Zhao, Jie, et al.
Published: (2024)

RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications
by: Liu, Xingyu, et al.
Published: (2024)

Coherent 3D Scene Diffusion From a Single RGB Image
by: Dahnert, Manuel, et al.
Published: (2024)

CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image
by: Yao, Kaixin, et al.
Published: (2025)

Contrastive Language-Colored Pointmap Pretraining for Unified 3D Scene Understanding
by: Mao, Ye, et al.
Published: (2026)

Time-to-Event Pretraining for 3D Medical Imaging
by: Huo, Zepeng, et al.
Published: (2024)

TrackAny3D: Transferring Pretrained 3D Models for Category-unified 3D Point Cloud Tracking
by: Wang, Mengmeng, et al.
Published: (2025)

3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining
by: Yan, Siming, et al.
Published: (2023)

Enhancing MLLM Spatial Understanding via Active 3D Scene Exploration for Multi-Perspective Reasoning
by: Chen, Jiahua, et al.
Published: (2026)

3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)