| Main Authors: | Shi, Cheng, Zhu, Yuchen, Yang, Sibei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.10083 |
Similar Items
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models
by: Shi, Cheng, et al.
Published: (2024)
Vision Function Layer in Multimodal LLMs
by: Shi, Cheng, et al.
Published: (2025)
Vision Transformers Need More Than Registers
by: Shi, Cheng, et al.
Published: (2026)
WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs
by: Zhang, Yulin, et al.
Published: (2026)
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
by: Shi, Cheng, et al.
Published: (2024)
Sim-DETR: Unlock DETR for Temporal Sentence Grounding
by: Tang, Jiajin, et al.
Published: (2025)
RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet
by: Orfaig, Eliraz, et al.
Published: (2025)
Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
by: Qian, Jiaye, et al.
Published: (2025)
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
by: Zhang, Yulin, et al.
Published: (2025)
Rethinking Query-based Transformer for Continual Image Segmentation
by: Zhu, Yuchen, et al.
Published: (2025)
SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection
by: Li, Yuxuan, et al.
Published: (2024)
Penalizing Boundary Activation for Object Completeness in Diffusion Models
by: Xu, Haoyang, et al.
Published: (2025)
No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
by: Yang, Bin, et al.
Published: (2025)
DCCS-Det: Directional Context and Cross-Scale-Aware Detector for Infrared Small Target
by: Li, Shuying, et al.
Published: (2026)
CerberusDet: Unified Multi-Dataset Object Detection
by: Tolstykh, Irina, et al.
Published: (2024)
RemDet: Rethinking Efficient Model Design for UAV Object Detection
by: Li, Chen, et al.
Published: (2024)
ChangeViT: Unleashing Plain Vision Transformers for Change Detection
by: Zhu, Duowang, et al.
Published: (2024)
InfoDet: A Dataset for Infographic Element Detection
by: Zhu, Jiangning, et al.
Published: (2025)
SimPLR: A Simple and Plain Transformer for Efficient Object Detection and Segmentation
by: Nguyen, Duy-Kien, et al.
Published: (2023)
SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector
by: Mao, Qianchen, et al.
Published: (2024)
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
by: Dai, Qiyuan, et al.
Published: (2024)
Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM
by: Dai, Qiyuan, et al.
Published: (2025)
UniDet3D: Multi-dataset Indoor 3D Object Detection
by: Kolodiazhnyi, Maksim, et al.
Published: (2024)
DetPO: In-Context Learning with Multi-Modal LLMs for Few-Shot Object Detection
by: Gare, Gautam Rajendrakumar, et al.
Published: (2026)
SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images
by: Li, Wenxi, et al.
Published: (2024)
MatchDet: A Collaborative Framework for Image Matching and Object Detection
by: Lai, Jinxiang, et al.
Published: (2023)
Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection
by: Zhang, Jiangning, et al.
Published: (2023)
DenoDet V2: Phase-Amplitude Cross Denoising for SAR Object Detection
by: Ni, Kang, et al.
Published: (2025)
HazyDet: Open-Source Benchmark for Drone-View Object Detection with Depth-Cues in Hazy Scenes
by: Feng, Changfeng, et al.
Published: (2024)
FMG-Det: Foundation Model Guided Robust Object Detection
by: Hannan, Darryl, et al.
Published: (2025)
FlowDet: Unifying Object Detection and Generative Transport Flows
by: Baty, Enis, et al.
Published: (2025)
GVSynergy-Det: Synergistic Gaussian-Voxel Representations for Multi-View 3D Object Detection
by: Zhang, Yi, et al.
Published: (2025)
VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection
by: Li, Wuyang, et al.
Published: (2025)
AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection
by: Shi, Yangting, et al.
Published: (2025)
CitDet: A Benchmark Dataset for Citrus Fruit Detection
by: James, Jordan A., et al.
Published: (2023)
WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
by: Lin, Zhenxiang, et al.
Published: (2023)
M^3-GloDets: Multi-Region and Multi-Scale Analysis of Fine-Grained Diseased Glomerular Detection
by: Shi, Tianyu, et al.
Published: (2025)
GenDet: Painting Colored Bounding Boxes on Images via Diffusion Model for Object Detection
by: Min, Chen, et al.
Published: (2026)
Comparison Of Deep Object Detectors On A New Vulnerable Pedestrian Dataset
by: Sharma, Devansh, et al.
Published: (2022)
LMM-Det: Make Large Multimodal Models Excel in Object Detection
by: Li, Jincheng, et al.
Published: (2025)