| Main Authors: | Yuan, Qihao, Li, Kailai, Zhang, Jiaming |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.14594 |
Similar Items
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
by: Li, Rong, et al.
Published: (2024)
Zero-Shot 3D Visual Grounding from Vision-Language Models
by: Li, Rong, et al.
Published: (2025)
Z3D: Zero-Shot 3D Visual Grounding from Images
by: Drozdov, Nikita, et al.
Published: (2026)
Zero-Shot Visual Grounding in 3D Gaussians via View Retrieval
by: Liao, Liwei, et al.
Published: (2025)
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
by: Yuan, Zhihao, et al.
Published: (2023)
Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer
by: Lei, Jiaming, et al.
Published: (2024)
Multiple Consistent 2D-3D Mappings for Robust Zero-Shot 3D Visual Grounding
by: Yin, Yufei, et al.
Published: (2026)
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
by: Xu, Runsen, et al.
Published: (2024)
SceneGraphGrounder: Zero-Shot 3D Visual Grounding via Structured Scene Graph Matching
by: Sun, Xuefei, et al.
Published: (2026)
SeqVLM: Proposal-Guided Multi-View Sequences Reasoning via VLM for Zero-Shot 3D Visual Grounding
by: Lin, Jiawen, et al.
Published: (2025)
AgentGrounder: Zero-Shot 3D Visual Pointcloud Grounding using Multimodal Language Models
by: Huynh, Cuong, et al.
Published: (2026)
Grounding Descriptions in Images informs Zero-Shot Visual Recognition
by: Halbe, Shaunak, et al.
Published: (2024)
3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
by: Li, Xiaoqi, et al.
Published: (2025)
pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning
by: Luo, Zhanpeng, et al.
Published: (2026)
Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding
by: Wang, Haibo, et al.
Published: (2026)
ZONE: Zero-Shot Instruction-Guided Local Editing
by: Li, Shanglin, et al.
Published: (2023)
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
by: Min, Yunhong, et al.
Published: (2025)
Distributed Zero-Shot Learning for Visual Recognition
by: Chen, Zhi, et al.
Published: (2025)
RSVG-ZeroOV: Exploring a Training-Free Framework for Zero-Shot Open-Vocabulary Visual Grounding in Remote Sensing Images
by: Li, Ke, et al.
Published: (2025)
Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object
by: Lin, Yuxuan, et al.
Published: (2025)
Attend and Enrich: Enhanced Visual Prompt for Zero-Shot Learning
by: Liu, Man, et al.
Published: (2024)
Learning Visual Proxy for Compositional Zero-Shot Learning
by: Zhang, Shiyu, et al.
Published: (2025)
$\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning
by: Li, Lin, et al.
Published: (2025)
ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning
by: Hou, Wenjin, et al.
Published: (2024)
Weakly-Supervised 3D Visual Grounding based on Visual Language Alignment
by: Xu, Xiaoxu, et al.
Published: (2023)
SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
by: Jin, Zhao, et al.
Published: (2025)
Visual Grounding with Attention-Driven Constraint Balancing
by: Kang, Weitai, et al.
Published: (2024)
SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning
by: Chen, Zhi, et al.
Published: (2025)
Neuro-3D: Towards 3D Visual Decoding from EEG Signals
by: Guo, Zhanqiang, et al.
Published: (2024)
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
by: Liu, Qihao, et al.
Published: (2024)
OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging
by: Tang, Yijie, et al.
Published: (2025)
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
by: Jiang, Huajie, et al.
Published: (2025)
Fine-Grained Zero-Shot Composed Image Retrieval with Complementary Visual-Semantic Integration
by: Ye, Yongcong, et al.
Published: (2026)
RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning
by: Vogel, Alexander, et al.
Published: (2025)
Open-Pose 3D Zero-Shot Learning: Benchmark and Challenges
by: Zhao, Weiguang, et al.
Published: (2023)
ChangingGrounding: 3D Visual Grounding in Changing Scenes
by: Hu, Miao, et al.
Published: (2025)
StoryTailor: A Zero-Shot Pipeline for Action-Rich Multi-Subject Visual Narratives
by: Hu, Jinghao, et al.
Published: (2026)
ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment
by: Dong, Mingyu, et al.
Published: (2026)
SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
by: Zantout, Nader, et al.
Published: (2025)
Zero-Shot 4D Lidar Panoptic Segmentation
by: Zhang, Yushan, et al.
Published: (2025)