| Main Authors: | Wang, Zilin, Yu, Stella X. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.19410 |
Similar Items
Open Ad-hoc Categorization with Contextualized Feature Learning
by: Wang, Zilin, et al.
Published: (2025)
Free-Grained Hierarchical Visual Recognition
by: Park, Seulki, et al.
Published: (2025)
Normalize Filters! Classical Wisdom for Deep Vision
by: Perez, Gustavo, et al.
Published: (2025)
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
by: Shi, Yuheng, et al.
Published: (2024)
Interpretable Embedding for Ad-hoc Video Search
by: Wu, Jiaxin, et al.
Published: (2024)
SHED Light on Segmentation for Dense Prediction
by: Lee, Seung Hyun, et al.
Published: (2026)
Aligning Forest and Trees in Images & Long Captions for Visually Grounded Understanding
by: Woo, Byeongju, et al.
Published: (2026)
Poster: Reliable 3D Reconstruction for Ad-hoc Edge Implementations
by: Absur, Md Nurul, et al.
Published: (2024)
A Unified Framework for Event-based Frame Interpolation with Ad-hoc Deblurring in the Wild
by: Sun, Lei, et al.
Published: (2023)
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
by: Wei, Zhixiang, et al.
Published: (2023)
Harnessing Vision-Language Pretrained Models with Temporal-Aware Adaptation for Referring Video Object Segmentation
by: Zhou, Zikun, et al.
Published: (2024)
Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation
by: Wu, Dongyue, et al.
Published: (2024)
Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank
by: Wu, Jiaxin, et al.
Published: (2024)
Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search
by: Hu, Fan, et al.
Published: (2025)
Decomposed Vision-Language Alignment for Fine-Grained Open-Vocabulary Segmentation
by: Wang, Chenhao, et al.
Published: (2026)
CoCo-SAM3: Harnessing Concept Conflict in Open-Vocabulary Semantic Segmentation
by: Chen, Yanhui, et al.
Published: (2026)
Post-hoc Probabilistic Vision-Language Models
by: Baumann, Anton, et al.
Published: (2024)
Novel View Synthesis from A Few Glimpses via Test-Time Natural Video Completion
by: Xu, Yan, et al.
Published: (2025)
Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models
by: Zhao, Kai, et al.
Published: (2025)
FairCLIP: Harnessing Fairness in Vision-Language Learning
by: Luo, Yan, et al.
Published: (2024)
AerOSeg: Harnessing SAM for Open-Vocabulary Segmentation in Remote Sensing Images
by: Dutta, Saikat, et al.
Published: (2025)
Selective, Regularized, and Calibrated: Harnessing Vision Foundation Models for Cross-Domain Few-Shot Semantic Segmentation
by: Ma, Junyuan, et al.
Published: (2026)
Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation
by: Jiao, Siyu, et al.
Published: (2024)
Next-Embedding Prediction Makes Strong Vision Learners
by: Xu, Sihan, et al.
Published: (2025)
Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
by: Wu, Junyi, et al.
Published: (2024)
TSegAgent: Zero-Shot Tooth Segmentation via Geometry-Aware Vision-Language Agents
by: Zhuang, Shaojie, et al.
Published: (2026)
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization
by: Wang, Jiayun, et al.
Published: (2024)
Novel Category Discovery with X-Agent Attention for Open-Vocabulary Semantic Segmentation
by: Li, Jiahao, et al.
Published: (2025)
Beyond-Labels: Advancing Open-Vocabulary Segmentation With Vision-Language Models
by: Rahman, Muhammad Atta ur, et al.
Published: (2025)
Exploring Vision-Language Models for Open-Vocabulary Zero-Shot Action Segmentation
by: Unmesh, Asim, et al.
Published: (2026)
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
by: Noori, Mehrdad, et al.
Published: (2025)
Adapting Vision-Language Model with Fine-grained Semantics for Open-Vocabulary Segmentation
by: Chng, Yong Xien, et al.
Published: (2024)
Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty
by: Mossina, Luca, et al.
Published: (2024)
GeoSANE: Learning Geospatial Representations from Models, Not Data
by: Hanna, Joelle, et al.
Published: (2026)
REALM: An MLLM-Agent Framework for Open World 3D Reasoning Segmentation and Editing on Gaussian Splatting
by: Shi, Changyue, et al.
Published: (2025)
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
by: Liu, Yuan, et al.
Published: (2025)
Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking
by: Pätzold, Bastian, et al.
Published: (2025)
Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments
by: Yu, Meng, et al.
Published: (2024)
Test-time Contrastive Concepts for Open-world Semantic Segmentation with Vision-Language Models
by: Wysoczańska, Monika, et al.
Published: (2024)
Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
by: Lu, Yiren, et al.
Published: (2025)