:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yuelei; Kim, Hyunjin; Zhan, Fangneng; Qiu, Ri-Zhao; Ji, Mazeyu; Shan, Xiaojun; Zou, Xueyan; Liang, Paul; Pfister, Hanspeter; Wang, Xiaolong
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2503.24270

Similar Items

GraspSplats: Efficient Manipulation with 3D Feature Splatting
by: Ji, Mazeyu, et al.
Published: (2024)

AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
by: Xu, Tianling, et al.
Published: (2025)

GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation
by: Jiang, Guangqi, et al.
Published: (2025)

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models
by: Liu, Yifan, et al.
Published: (2025)

RoboTAG: End-to-end Robot Configuration Estimation via Topological Alignment Graph
by: Liu, Yifan, et al.
Published: (2025)

GeCo: Evaluating Geometric Consistency for Video Generation via Motion and Structure
by: Gu, Leslie, et al.
Published: (2025)

WildLMa: Long Horizon Loco-Manipulation in the Wild
by: Qiu, Ri-Zhao, et al.
Published: (2024)

M3: 3D-Spatial MultiModal Memory
by: Zou, Xueyan, et al.
Published: (2025)

Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey
by: Zhang, Jiahui, et al.
Published: (2025)

General Neural Gauge Fields
by: Zhan, Fangneng, et al.
Published: (2023)

Lite2Relight: 3D-aware Single Image Portrait Relighting
by: Rao, Pramod, et al.
Published: (2024)

CTRL-GS: Cascaded Temporal Residue Learning for 4D Gaussian Splatting
by: Hou, Karly, et al.
Published: (2025)

Gaussian-Augmented Physics Simulation and System Identification with Complex Colliders
by: Vasile, Federico, et al.
Published: (2025)

Visual Whole-Body Control for Legged Loco-Manipulation
by: Liu, Minghuan, et al.
Published: (2024)

DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
by: Chi, Yu, et al.
Published: (2023)

Stream3D: Sequential Multi-View 3D Generation via Evidential Memory
by: Zhou, Kaichen, et al.
Published: (2026)

LoRA-TTT: Low-Rank Test-Time Training for Vision-Language Models
by: Kojima, Yuto, et al.
Published: (2025)

When Visuals Aren't the Problem: Evaluating Vision-Language Models on Misleading Data Visualizations
by: Lalai, Harsh Nishant, et al.
Published: (2026)

Joint-Task Regularization for Partially Labeled Multi-Task Learning
by: Nishi, Kento, et al.
Published: (2024)

3DPR: Single Image 3D Portrait Relight using Generative Priors
by: Rao, Pramod, et al.
Published: (2025)

MoRA: LoRA Guided Multi-Modal Disease Diagnosis with Missing Modality
by: Shi, Zhiyi, et al.
Published: (2024)

DiffAge3D: Diffusion-based 3D-aware Face Aging
by: Wahid, Junaid, et al.
Published: (2024)

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting
by: Zhang, Jiahui, et al.
Published: (2025)

LangFlash: Feed-forward 3D Language Gaussian Splatting from Sparse Unposed Images
by: Liu, Yilong, et al.
Published: (2026)

RiGS: Rigid-aware 4D Gaussian Splatting from a Single Monocular Video
by: Wu, Chenyu, et al.
Published: (2026)

Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
by: He, Jixuan, et al.
Published: (2024)

Understanding Graphical Perception in Data Visualization through Zero-shot Prompting of Vision-Language Models
by: Guo, Grace, et al.
Published: (2024)

Visual Instruction-Finetuned Language Model for Versatile Brain MR Image Tasks
by: Kim, Jonghun, et al.
Published: (2026)

Integrating LMM Planners and 3D Skill Policies for Generalizable Manipulation
by: Li, Yuelei, et al.
Published: (2025)

LangSplat: 3D Language Gaussian Splatting
by: Qin, Minghan, et al.
Published: (2023)

Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing
by: Qiu, Ri-Zhao, et al.
Published: (2024)

Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
by: Senocak, Arda, et al.
Published: (2024)

Towards 1000-fold Electron Microscopy Image Compression for Connectomics via VQ-VAE with Transformer Prior
by: Yang, Fuming, et al.
Published: (2025)

MixLight: Borrowing the Best of both Spherical Harmonics and Gaussian Models
by: Ji, Xinlong, et al.
Published: (2024)

DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models
by: Shi, Zhiyi, et al.
Published: (2025)

FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization
by: Zhang, Jiahui, et al.
Published: (2024)

MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
by: Xu, Muyu, et al.
Published: (2025)

Tree of Attributes Prompt Learning for Vision-Language Models
by: Ding, Tong, et al.
Published: (2024)

PAGE-4D: VGGT-4D Perception via Disentangled Pose and Geometry Estimation
by: Zhou, Kaichen, et al.
Published: (2025)

Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models
by: Magid, Salma Abdel, et al.
Published: (2024)