:: Library Catalog

Bibliographic Details
Main Authors:	Ren, Xuhua; Shi, Hengcan; Li, Jin
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.07518

Similar Items

CoT-PL: Chain-of-Thought Pseudo-Labeling for Open-Vocabulary Object Detection
by: Choi, Hojun, et al.
Published: (2025)

Data-Efficient Semantic Segmentation of 3D Point Clouds via Open-Vocabulary Image Segmentation-based Pseudo-Labeling
by: Furuya, Takahiko
Published: (2026)

Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning
by: Li, Hongxi, et al.
Published: (2026)

DART: Dual Adaptive Refinement Transfer for Open-Vocabulary Multi-Label Recognition
by: Liu, Haijing, et al.
Published: (2025)

Beyond Reward Margin: Rethinking and Resolving Likelihood Displacement in Diffusion Models via Video Generation
by: Xu, Ruojun, et al.
Published: (2025)

GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts
by: Milacski, Zoltán Á., et al.
Published: (2024)

Classifying the Unknown: In-Context Learning for Open-Vocabulary Text and Symbol Recognition
by: Simon, Tom, et al.
Published: (2025)

Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport
by: Tan, Hao, et al.
Published: (2025)

DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
by: Le, Duy-Tho, et al.
Published: (2024)

Exploring Open-Vocabulary Object Recognition in Images using CLIP
by: Chen, Wei Yu, et al.
Published: (2026)

LEGO: Self-Supervised Representation Learning for Scene Text Images
by: Ren, Yujin, et al.
Published: (2024)

Open-Vocabulary Domain Generalization in Urban-Scene Segmentation
by: Zhao, Dong, et al.
Published: (2026)

Open Vocabulary Semantic Scene Sketch Understanding
by: Bourouis, Ahmed, et al.
Published: (2023)

Open Vocabulary Multi-Label Video Classification
by: Gupta, Rohit, et al.
Published: (2024)

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
by: Jiang, Haochen, et al.
Published: (2024)

Category-Adaptive Cross-Modal Semantic Refinement and Transfer for Open-Vocabulary Multi-Label Recognition
by: Liu, Haijing, et al.
Published: (2024)

Open-Vocabulary Octree-Graph for 3D Scene Understanding
by: Wang, Zhigang, et al.
Published: (2024)

Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding
by: Tai, Hanchen, et al.
Published: (2024)

Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images
by: Dai, Jinkun, et al.
Published: (2026)

Incomplete Multi-Label Image Recognition by Co-learning Semantic-Aware Features and Label Recovery
by: He, Zhi-Fen, et al.
Published: (2025)

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments
by: Le, Duy-Tho, et al.
Published: (2024)

ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
by: Peng, Cihang, et al.
Published: (2025)

Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
by: Wang, Pengfei, et al.
Published: (2024)

FOLK: Fast Open-Vocabulary 3D Instance Segmentation via Label-guided Knowledge Distillation
by: Wu, Hongrui, et al.
Published: (2025)

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation
by: Jiao, Siyu, et al.
Published: (2024)

Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
by: Zhou, Changqing, et al.
Published: (2026)

Text-Region Matching for Multi-Label Image Recognition with Missing Labels
by: Ma, Leilei, et al.
Published: (2024)

MPT: Motion Prompt Tuning for Micro-Expression Recognition
by: Liu, Jiateng, et al.
Published: (2025)

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
by: Zhang, Yifei, et al.
Published: (2025)

EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting
by: Li, Di, et al.
Published: (2025)

Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
by: Li, Ruihuang, et al.
Published: (2024)

Interaction-Centric Knowledge Infusion and Transfer for Open-Vocabulary Scene Graph Generation
by: Li, Lin, et al.
Published: (2025)

Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
by: Zhu, Xiaoyu, et al.
Published: (2024)

RT-OVAD: Real-Time Open-Vocabulary Aerial Object Detection via Image-Text Collaboration
by: Wei, Guoting, et al.
Published: (2024)

Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
by: Shin, Heeseong, et al.
Published: (2024)

USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
by: Wang, Xiaoqi, et al.
Published: (2024)

OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
by: Kong, Lingdong, et al.
Published: (2024)

DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
by: Cheng, Haozhe, et al.
Published: (2024)

OVMR: Open-Vocabulary Recognition with Multi-Modal References
by: Ma, Zehong, et al.
Published: (2024)

Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation
by: Ahn, Jinwoo, et al.
Published: (2024)