:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Zilin; Yu, Stella X.
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.19410

Similar Items

Open Ad-hoc Categorization with Contextualized Feature Learning
by: Wang, Zilin, et al.
Published: (2025)

Free-Grained Hierarchical Visual Recognition
by: Park, Seulki, et al.
Published: (2025)

Normalize Filters! Classical Wisdom for Deep Vision
by: Perez, Gustavo, et al.
Published: (2025)

Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
by: Shi, Yuheng, et al.
Published: (2024)

Interpretable Embedding for Ad-hoc Video Search
by: Wu, Jiaxin, et al.
Published: (2024)

SHED Light on Segmentation for Dense Prediction
by: Lee, Seung Hyun, et al.
Published: (2026)

Aligning Forest and Trees in Images & Long Captions for Visually Grounded Understanding
by: Woo, Byeongju, et al.
Published: (2026)

Poster: Reliable 3D Reconstruction for Ad-hoc Edge Implementations
by: Absur, Md Nurul, et al.
Published: (2024)

A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild
by: Sun, Lei, et al.
Published: (2023)

Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
by: Wei, Zhixiang, et al.
Published: (2023)

Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation
by: Zhou, Zikun, et al.
Published: (2024)

Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation
by: Wu, Dongyue, et al.
Published: (2024)

Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank
by: Wu, Jiaxin, et al.
Published: (2024)

Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search
by: Hu, Fan, et al.
Published: (2025)

Decomposed Vision-Language Alignment for Fine-Grained Open-Vocabulary Segmentation
by: Wang, Chenhao, et al.
Published: (2026)

CoCo-SAM3: Harnessing Concept Conflict in Open-Vocabulary Semantic Segmentation
by: Chen, Yanhui, et al.
Published: (2026)

Post-hoc Probabilistic Vision-Language Models
by: Baumann, Anton, et al.
Published: (2024)

Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion
by: Xu, Yan, et al.
Published: (2025)

Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models
by: Zhao, Kai, et al.
Published: (2025)

FairCLIP: Harnessing Fairness in Vision-Language Learning
by: Luo, Yan, et al.
Published: (2024)

AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images
by: Dutta, Saikat, et al.
Published: (2025)

Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation
by: Ma, Junyuan, et al.
Published: (2026)

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation
by: Jiao, Siyu, et al.
Published: (2024)

Next-Embedding Prediction Makes Strong Vision Learners
by: Xu, Sihan, et al.
Published: (2025)

Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
by: Wu, Junyi, et al.
Published: (2024)

TSegAgent: Zero-Shot Tooth Segmentation via Geometry-Aware Vision-Language Agents
by: Zhuang, Shaojie, et al.
Published: (2026)

Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization
by: Wang, Jiayun, et al.
Published: (2024)

Novel Category Discovery with X-Agent Attention for Open-Vocabulary Semantic Segmentation
by: Li, Jiahao, et al.
Published: (2025)

Beyond-Labels: Advancing Open-Vocabulary Segmentation With Vision-Language Models
by: Rahman, Muhammad Atta ur, et al.
Published: (2025)

Exploring Vision-Language Models for Open-Vocabulary Zero-Shot Action Segmentation
by: Unmesh, Asim, et al.
Published: (2026)

Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
by: Noori, Mehrdad, et al.
Published: (2025)

Adapting Vision-Language Model with Fine-grained Semantics for Open-Vocabulary Segmentation
by: Chng, Yong Xien, et al.
Published: (2024)

Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty
by: Mossina, Luca, et al.
Published: (2024)

GeoSANE: Learning Geospatial Representations from Models, Not Data
by: Hanna, Joelle, et al.
Published: (2026)

REALM: An MLLM-Agent Framework for Open World 3D Reasoning Segmentation and Editing on Gaussian Splatting
by: Shi, Changyue, et al.
Published: (2025)

POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
by: Liu, Yuan, et al.
Published: (2025)

Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking
by: Pätzold, Bastian, et al.
Published: (2025)

Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments
by: Yu, Meng, et al.
Published: (2024)

Test-time Contrastive Concepts for Open-world Semantic Segmentation with Vision-Language Models
by: Wysoczańska, Monika, et al.
Published: (2024)

Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
by: Lu, Yiren, et al.
Published: (2025)