Staff View: Evaluating the Performance of Open-Vocabulary Object Detection in Low-quality Image

Bibliographic Details
Main Author:	Wu, Po-Chih
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.22801

_version_	1866911349736472576
author	Wu, Po-Chih
author_facet	Wu, Po-Chih
contents	Open-vocabulary object detection enables models to localize and recognize objects beyond a predefined set of categories and is expected to achieve recognition capabilities comparable to human performance. In this study, we aim to evaluate the performance of existing models on open-vocabulary object detection tasks under low-quality image conditions. For this purpose, we introduce a new dataset that simulates low-quality images in the real world. In our evaluation experiment, we find that although open-vocabulary object detection models exhibited no significant decrease in mAP scores under low-level image degradation, the performance of all models dropped sharply under high-level image degradation. OWLv2 models consistently performed better across different types of degradation, while OWL-ViT, GroundingDINO, and Detic showed significant performance declines. We will release our dataset and code to facilitate future studies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_22801
institution	arXiv
publishDate	2025
record_format	arxiv
title	Evaluating the Performance of Open-Vocabulary Object Detection in Low-quality Image
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2512.22801

Similar Items