:: Library Catalog

Bibliographic Details
Main Authors:	Xiao, Qinfeng; Mei, Guofeng; Liu, Qilong; Yi, Chenyuan; Poiesi, Fabio; Zhang, Jian; Yang, Bo; Kit-lun, Yick
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.07652

Similar Items

Universal 3D Shape Matching via Coarse-to-Fine Language Guidance
by: Xiao, Qinfeng, et al.
Published: (2026)

Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
by: Mei, Guofeng, et al.
Published: (2024)

Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding
by: Mei, Guofeng, et al.
Published: (2023)

Multimodal Fusion SLAM with Fourier Attention
by: Zhou, Youjie, et al.
Published: (2025)

Fully-Geometric Cross-Attention for Point Cloud Registration
by: Wang, Weijie, et al.
Published: (2025)

PerLA: Perceptive 3D Language Assistant
by: Mei, Guofeng, et al.
Published: (2024)

Efficient Encoder-Free Fourier-based 3D Large Multimodal Model
by: Mei, Guofeng, et al.
Published: (2026)

Masked Clustering Prediction for Unsupervised Point Cloud Pre-training
by: Ren, Bin, et al.
Published: (2025)

ZeroReg: Zero-Shot Point Cloud Registration with Foundation Models
by: Wang, Weijie, et al.
Published: (2023)

Free-form language-based robotic reasoning and grasping
by: Jiao, Runyu, et al.
Published: (2025)

Obstruction reasoning for robotic grasping
by: Jiao, Runyu, et al.
Published: (2025)

Action-guided generation of 3D functionality segmentation data
by: Corsetti, Jaime, et al.
Published: (2025)

Self-Supervised and Generalizable Tokenization for CLIP-Based 3D Understanding
by: Mei, Guofeng, et al.
Published: (2025)

Accurate and efficient zero-shot 6D pose estimation with frozen foundation models
by: Caraffa, Andrea, et al.
Published: (2025)

$S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models
by: Yin, Xiaojie, et al.
Published: (2024)

Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection
by: Mekhalfi, Mohamed Lamine, et al.
Published: (2025)

GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation
by: Li, Abiao, et al.
Published: (2024)

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
by: Li, Jinlong, et al.
Published: (2025)

FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models
by: Caraffa, Andrea, et al.
Published: (2023)

Distilling 3D distinctive local descriptors for 6D pose estimation
by: Hamza, Amir, et al.
Published: (2025)

Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
by: Tur, Anil Osman, et al.
Published: (2024)

Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study
by: Zhu, Qinfeng, et al.
Published: (2024)

Graph Matching Optimization Network for Point Cloud Registration
by: Wu, Qianliang, et al.
Published: (2023)

Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
by: Yin, Xiaojie, et al.
Published: (2025)

View-on-Graph: Zero-shot 3D Visual Grounding via Vision-Language Reasoning on Scene Graphs
by: Liu, Yuanyuan, et al.
Published: (2025)

SOCO: Benchmarking Semantic Object Correspondence in Vision Foundation Models
by: Dünkel, Olaf, et al.
Published: (2026)

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform
by: Park, Chunghyun, et al.
Published: (2024)

Open-vocabulary object 6D pose estimation
by: Corsetti, Jaime, et al.
Published: (2023)

Functionality understanding and segmentation in 3D scenes
by: Corsetti, Jaime, et al.
Published: (2024)

An analysis of vision-language models for fabric retrieval
by: Giuliari, Francesco, et al.
Published: (2025)

Generative 6D Pose Estimation via Conditional Flow Matching
by: Hamza, Amir, et al.
Published: (2026)

Novel class discovery meets foundation models for 3D semantic segmentation
by: Riz, Luigi, et al.
Published: (2023)

OpenHype: Hyperbolic Embeddings for Hierarchical Open-Vocabulary Radiance Fields
by: Weijler, Lisa, et al.
Published: (2025)

On Unsupervised Partial Shape Correspondence
by: Bracha, Amit, et al.
Published: (2023)

Toward Semantic-Agnostic and Shape-Aware Vision-Language Segmentation Models
by: Seutin, Corentin, et al.
Published: (2026)

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model
by: Bortolon, Matteo, et al.
Published: (2024)

Towards Cross-View Point Correspondence in Vision-Language Models
by: Wang, Yipu, et al.
Published: (2025)

Shape-of-You: Fused Gromov-Wasserstein Optimal Transport for Semantic Correspondence in-the-Wild
by: Im, Jiin, et al.
Published: (2026)

IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model
by: Bortolon, Matteo, et al.
Published: (2024)

CAS-IQA: Teaching Vision-Language Models for Synthetic Angiography Quality Assessment
by: Wang, Bo, et al.
Published: (2025)