:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yan; Guo, Weiwei; Yang, Xue; Liao, Ning; Zhang, Shaofeng; Yu, Yi; Yu, Wenxian; Yan, Junchi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.02057

Similar Items

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning
by: Li, Yan, et al.
Published: (2023)

FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach
by: Liao, Ning, et al.
Published: (2026)

ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection
by: Zeng, Ying, et al.
Published: (2023)

On the Evaluation and Refinement of Vision-Language Instruction Tuning Datasets
by: Liao, Ning, et al.
Published: (2023)

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
by: Yu, Yi, et al.
Published: (2025)

Cross-View Open-Vocabulary Object Detection in Aerial Imagery
by: Kini, Jyoti, et al.
Published: (2025)

EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation
by: Li, Yan, et al.
Published: (2026)

RT-OVAD: Real-Time Open-Vocabulary Aerial Object Detection via Image-Text Collaboration
by: Wei, Guoting, et al.
Published: (2024)

Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection
by: Yu, Yi, et al.
Published: (2025)

M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios
by: Liao, Ning, et al.
Published: (2023)

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
by: Zhang, Xiangdong, et al.
Published: (2024)

Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
by: Zhang, Xiangdong, et al.
Published: (2025)

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
by: Bao, Wentao, et al.
Published: (2024)

Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision
by: Yu, Yi, et al.
Published: (2023)

Collaborative Learning for Unsupervised Multimodal Remote Sensing Image Registration: Integrating Self-Supervision and MIM-Guided Diffusion-Based Image Translation
by: Wei, Xiaochen, et al.
Published: (2025)

Streamlined Open-Vocabulary Human-Object Interaction Detection
by: Sun, Chang, et al.
Published: (2026)

Open-Vocabulary Object Detection in UAV Imagery: A Review and Future Perspectives
by: Zhou, Yang, et al.
Published: (2025)

Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection
by: Lei, Ting, et al.
Published: (2024)

OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
by: Li, Jinyang, et al.
Published: (2025)

VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
by: Zhang, Xiangdong, et al.
Published: (2025)

Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space
by: Li, Yan, et al.
Published: (2025)

VK-Det: Visual Knowledge Guided Prototype Learning for Open-Vocabulary Aerial Object Detection
by: Yao, Jianhang, et al.
Published: (2025)

Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration
by: Lei, Ting, et al.
Published: (2025)

A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection
by: Fu, Shenghao, et al.
Published: (2025)

Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data
by: Huang, Rui, et al.
Published: (2024)

Grid-Reg: Detector-Free Gridized Feature Learning and Matching for Large-Scale SAR-Optical Image Registration
by: Wei, Xiaochen, et al.
Published: (2025)

NITP: Next Implicit Token Prediction for LLM Pre-training
by: Zhang, Xiangdong, et al.
Published: (2026)

Scaling Open-Vocabulary Object Detection
by: Minderer, Matthias, et al.
Published: (2023)

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
by: Du, Penghui, et al.
Published: (2024)

Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
by: Qiang, Sunyuan, et al.
Published: (2024)

NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
by: Qin, Xiaohan, et al.
Published: (2025)

OS-W2S: An Automatic Labeling Engine for Language-Guided Open-Set Aerial Object Detection
by: Wei, Guoting, et al.
Published: (2025)

FACTOR: Counterfactual Training-Free Test-Time Adaptation for Open-Vocabulary Object Detection
by: Zhao, Kaixiang, et al.
Published: (2026)

The Detector Teaches Itself: Lightweight Self-Supervised Adaptation for Open-Vocabulary Object Detection
by: Wan, Yazhe, et al.
Published: (2026)

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
by: Zhang, Shaofeng, et al.
Published: (2024)

Retrieval-Augmented Open-Vocabulary Object Detection
by: Kim, Jooyeon, et al.
Published: (2024)

Learning to Detect and Segment for Open Vocabulary Object Detection
by: Wang, Tao, et al.
Published: (2022)

Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
by: Zhang, Shaofeng, et al.
Published: (2025)

Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images
by: Dionelis, Nikolaos, et al.
Published: (2024)

Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images
by: Shi, Zhifei, et al.
Published: (2024)