| Main Authors: | Hu, Junyi, Bai, Tian, Wu, Fengyi, Li, Wenyan, Peng, Zhenming, Zhang, Yi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.22666 |
Similar Items
P$^2$HCT: Plug-and-Play Hierarchical C2F Transformer for Multi-Scale Feature Fusion
by: Hu, Junyi, et al.
Published: (2025)
GALA: Guided Attention with Language Alignment for Open Vocabulary Gaussian Splatting
by: Alegret, Elena, et al.
Published: (2025)
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation
by: Fang, Hao, et al.
Published: (2024)
Prototype-Aware Multimodal Alignment for Open-Vocabulary Visual Grounding
by: Xie, Jiangnan, et al.
Published: (2025)
Decomposed Vision-Language Alignment for Fine-Grained Open-Vocabulary Segmentation
by: Wang, Chenhao, et al.
Published: (2026)
Neural Spatial-Temporal Tensor Representation for Infrared Small Target Detection
by: Wu, Fengyi, et al.
Published: (2024)
InfoCLIP: Bridging Vision-Language Pretraining and Open-Vocabulary Semantic Segmentation via Information-Theoretic Alignment Transfer
by: Yuan, Muyao, et al.
Published: (2025)
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction
by: Li, Yunheng, et al.
Published: (2024)
RPCANet++: Deep Interpretable Robust PCA for Sparse Object Segmentation
by: Wu, Fengyi, et al.
Published: (2025)
LAGO: Language-Guided Adaptive Object-Region Focus for Zero-Shot Visual-Text Alignment
by: Hu, Junyi, et al.
Published: (2026)
Open-Vocabulary Video Anomaly Detection
by: Wu, Peng, et al.
Published: (2023)
OVS-DINO: Open-Vocabulary Segmentation via Structure-Aligned SAM-DINO with Language Guidance
by: Zeng, Haoxi, et al.
Published: (2026)
LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation
by: Miao, Yang, et al.
Published: (2025)
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
by: Qiang, Sunyuan, et al.
Published: (2024)
Part-Aware Open-Vocabulary 3D Affordance Grounding via Prototypical Semantic and Geometric Alignment
by: Gou, Dongqiang, et al.
Published: (2026)
Bilateral Collaboration with Large Vision-Language Models for Open Vocabulary Human-Object Interaction Detection
by: Hu, Yupeng, et al.
Published: (2025)
MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
by: Xia, Yinan, et al.
Published: (2025)
DRPCA-Net: Make Robust PCA Great Again for Infrared Small Target Detection
by: Xiong, Zihao, et al.
Published: (2025)
Thermal-Det: Language-Guided Cross-Modal Distillation for Open-Vocabulary Thermal Object Detection
by: Ranasinghe, Yasiru, et al.
Published: (2026)
Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models
by: Zhao, Kai, et al.
Published: (2025)
Beyond-Labels: Advancing Open-Vocabulary Segmentation With Vision-Language Models
by: Rahman, Muhammad Atta ur, et al.
Published: (2025)
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
by: Li, Ruihuang, et al.
Published: (2024)
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
by: Wu, Size, et al.
Published: (2023)
Lost in Translation? Vocabulary Alignment for Source-Free Adaptation in Open-Vocabulary Semantic Segmentation
by: Mazzucco, Silvio, et al.
Published: (2025)
ComAlign: Compositional Alignment in Vision-Language Models
by: Abdollah, Ali, et al.
Published: (2024)
Modest-Align: Data-Efficient Alignment for Vision-Language Models
by: Liu, Jiaxiang, et al.
Published: (2025)
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
by: Li, Rongjie, et al.
Published: (2024)
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models
by: Ma, Ziqiao, et al.
Published: (2023)
Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
by: Wasim, Syed Talal, et al.
Published: (2023)
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
by: Kang, Dahyun, et al.
Published: (2024)
ExpPortrait: Expressive Portrait Generation via Personalized Representation
by: Wang, Junyi, et al.
Published: (2026)
Exploring Vision-Language Models for Open-Vocabulary Zero-Shot Action Segmentation
by: Unmesh, Asim, et al.
Published: (2026)
Test-Time Adaptation of Vision-Language Models for Open-Vocabulary Semantic Segmentation
by: Noori, Mehrdad, et al.
Published: (2025)
Adapting Vision-Language Model with Fine-grained Semantics for Open-Vocabulary Segmentation
by: Chng, Yong Xien, et al.
Published: (2024)
Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking
by: Pätzold, Bastian, et al.
Published: (2025)
AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models
by: Wu, Yuhang, et al.
Published: (2024)
Renovating Names in Open-Vocabulary Segmentation Benchmarks
by: Huang, Haiwen, et al.
Published: (2024)
Enhancing Open-Vocabulary Object Detection through Multi-Level Fine-Grained Visual-Language Alignment
by: Zhang, Tianyi, et al.
Published: (2026)
Semantic Alignment in Hyperbolic Space for Open-Vocabulary Semantic Segmentation
by: Truong, Hoang M., et al.
Published: (2026)
Denoise and Align: Diffusion-Driven Foreground Knowledge Prompting for Open-Vocabulary Temporal Action Detection
by: Zhu, Sa, et al.
Published: (2026)