:: Library Catalog

Bibliographic Details
Main Authors:	Yuan, Qihao; Li, Kailai; Zhang, Jiaming
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.14594

Similar Items

SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
by: Li, Rong, et al.
Published: (2024)

Zero-Shot 3D Visual Grounding from Vision-Language Models
by: Li, Rong, et al.
Published: (2025)

Z3D: Zero-Shot 3D Visual Grounding from Images
by: Drozdov, Nikita, et al.
Published: (2026)

Zero-Shot Visual Grounding in 3D Gaussians via View Retrieval
by: Liao, Liwei, et al.
Published: (2025)

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
by: Yuan, Zhihao, et al.
Published: (2023)

Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer
by: Lei, Jiaming, et al.
Published: (2024)

Multiple Consistent 2D-3D Mappings for Robust Zero-Shot 3D Visual Grounding
by: Yin, Yufei, et al.
Published: (2026)

VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
by: Xu, Runsen, et al.
Published: (2024)

SceneGraphGrounder: Zero-Shot 3D Visual Grounding via Structured Scene Graph Matching
by: Sun, Xuefei, et al.
Published: (2026)

SeqVLM: Proposal-Guided Multi-View Sequences Reasoning via VLM for Zero-Shot 3D Visual Grounding
by: Lin, Jiawen, et al.
Published: (2025)

AgentGrounder: Zero-Shot 3D Visual Pointcloud Grounding using Multimodal Language Models
by: Huynh, Cuong, et al.
Published: (2026)

Grounding Descriptions in Images informs Zero-Shot Visual Recognition
by: Halbe, Shaunak, et al.
Published: (2024)

3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
by: Li, Xiaoqi, et al.
Published: (2025)

pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning
by: Luo, Zhanpeng, et al.
Published: (2026)

Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding
by: Wang, Haibo, et al.
Published: (2026)

ZONE: Zero-Shot Instruction-Guided Local Editing
by: Li, Shanglin, et al.
Published: (2023)

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
by: Min, Yunhong, et al.
Published: (2025)

Distributed Zero-Shot Learning for Visual Recognition
by: Chen, Zhi, et al.
Published: (2025)

RSVG-ZeroOV: Exploring a Training-Free Framework for Zero-Shot Open-Vocabulary Visual Grounding in Remote Sensing Images
by: Li, Ke, et al.
Published: (2025)

Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object
by: Lin, Yuxuan, et al.
Published: (2025)

Attend and Enrich: Enhanced Visual Prompt for Zero-Shot Learning
by: Liu, Man, et al.
Published: (2024)

Learning Visual Proxy for Compositional Zero-Shot Learning
by: Zhang, Shiyu, et al.
Published: (2025)

$\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning
by: Li, Lin, et al.
Published: (2025)

ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning
by: Hou, Wenjin, et al.
Published: (2024)

Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment
by: Xu, Xiaoxu, et al.
Published: (2023)

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
by: Jin, Zhao, et al.
Published: (2025)

Visual Grounding with Attention-Driven Constraint Balancing
by: Kang, Weitai, et al.
Published: (2024)

SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning
by: Chen, Zhi, et al.
Published: (2025)

Neuro-3D: Towards 3D Visual Decoding from EEG Signals
by: Guo, Zhanqiang, et al.
Published: (2024)

DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
by: Liu, Qihao, et al.
Published: (2024)

OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
by: Tang, Yijie, et al.
Published: (2025)

Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
by: Jiang, Huajie, et al.
Published: (2025)

Fine-Grained Zero-Shot Composed Image Retrieval with Complementary Visual-Semantic Integration
by: Ye, Yongcong, et al.
Published: (2026)

RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning
by: Vogel, Alexander, et al.
Published: (2025)

Open-Pose 3D Zero-Shot Learning: Benchmark and Challenges
by: Zhao, Weiguang, et al.
Published: (2023)

ChangingGrounding: 3D Visual Grounding in Changing Scenes
by: Hu, Miao, et al.
Published: (2025)

StoryTailor: A Zero-Shot Pipeline for Action-Rich Multi-Subject Visual Narratives
by: Hu, Jinghao, et al.
Published: (2026)

ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment
by: Dong, Mingyu, et al.
Published: (2026)

SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
by: Zantout, Nader, et al.
Published: (2025)

Zero-Shot 4D Lidar Panoptic Segmentation
by: Zhang, Yushan, et al.
Published: (2025)