| Main Authors: | Ren, Xuhua, Shi, Hengcan, Li, Jin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.07518 |
Similar Items
CoT-PL: Chain-of-Thought Pseudo-Labeling for Open-Vocabulary Object Detection
by: Choi, Hojun, et al.
Published: (2025)
Data-Efficient Semantic Segmentation of 3D Point Clouds via Open-Vocabulary Image Segmentation-based Pseudo-Labeling
by: Furuya, Takahiko
Published: (2026)
Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning
by: Li, Hongxi, et al.
Published: (2026)
DART: Dual Adaptive Refinement Transfer for Open-Vocabulary Multi-Label Recognition
by: Liu, Haijing, et al.
Published: (2025)
Beyond Reward Margin: Rethinking and Resolving Likelihood Displacement in Diffusion Models via Video Generation
by: Xu, Ruojun, et al.
Published: (2025)
GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and-Text Contexts
by: Milacski, Zoltán Á., et al.
Published: (2024)
Classifying the Unknown: In-Context Learning for Open-Vocabulary Text and Symbol Recognition
by: Simon, Tom, et al.
Published: (2025)
Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport
by: Tan, Hao, et al.
Published: (2025)
DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation
by: Le, Duy-Tho, et al.
Published: (2024)
Exploring Open-Vocabulary Object Recognition in Images using CLIP
by: Chen, Wei Yu, et al.
Published: (2026)
LEGO: Self-Supervised Representation Learning for Scene Text Images
by: Ren, Yujin, et al.
Published: (2024)
Open-Vocabulary Domain Generalization in Urban-Scene Segmentation
by: Zhao, Dong, et al.
Published: (2026)
Open Vocabulary Semantic Scene Sketch Understanding
by: Bourouis, Ahmed, et al.
Published: (2023)
Open Vocabulary Multi-Label Video Classification
by: Gupta, Rohit, et al.
Published: (2024)
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
by: Jiang, Haochen, et al.
Published: (2024)
Category-Adaptive Cross-Modal Semantic Refinement and Transfer for Open-Vocabulary Multi-Label Recognition
by: Liu, Haijing, et al.
Published: (2024)
Open-Vocabulary Octree-Graph for 3D Scene Understanding
by: Wang, Zhigang, et al.
Published: (2024)
Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding
by: Tai, Hanchen, et al.
Published: (2024)
Open-Vocabulary Semantic Segmentation Network Integrating Object-Level Label and Scene-Level Semantic Features for Multimodal Remote Sensing Images
by: Dai, Jinkun, et al.
Published: (2026)
Incomplete Multi-Label Image Recognition by Co-learning Semantic-Aware Features and Label Recovery
by: He, Zhi-Fen, et al.
Published: (2025)
JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments
by: Le, Duy-Tho, et al.
Published: (2024)
ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
by: Peng, Cihang, et al.
Published: (2025)
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
by: Wang, Pengfei, et al.
Published: (2024)
FOLK: Fast Open-Vocabulary 3D Instance Segmentation via Label-guided Knowledge Distillation
by: Wu, Hongrui, et al.
Published: (2025)
Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation
by: Jiao, Siyu, et al.
Published: (2024)
Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
by: Zhou, Changqing, et al.
Published: (2026)
Text-Region Matching for Multi-Label Image Recognition with Missing Labels
by: Ma, Leilei, et al.
Published: (2024)
MPT: Motion Prompt Tuning for Micro-Expression Recognition
by: Liu, Jiateng, et al.
Published: (2025)
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
by: Zhang, Yifei, et al.
Published: (2025)
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting
by: Li, Di, et al.
Published: (2025)
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
by: Li, Ruihuang, et al.
Published: (2024)
Interaction-Centric Knowledge Infusion and Transfer for Open-Vocabulary Scene Graph Generation
by: Li, Lin, et al.
Published: (2025)
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
by: Zhu, Xiaoyu, et al.
Published: (2024)
RT-OVAD: Real-Time Open-Vocabulary Aerial Object Detection via Image-Text Collaboration
by: Wei, Guoting, et al.
Published: (2024)
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
by: Shin, Heeseong, et al.
Published: (2024)
USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
by: Wang, Xiaoqi, et al.
Published: (2024)
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
by: Kong, Lingdong, et al.
Published: (2024)
DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition
by: Cheng, Haozhe, et al.
Published: (2024)
OVMR: Open-Vocabulary Recognition with Multi-Modal References
by: Ma, Zehong, et al.
Published: (2024)
Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation
by: Ahn, Jinwoo, et al.
Published: (2024)