Bibliographic Details
Main Authors: Kim, Louis Y., Karker, Michelle, Valledor, Victoria, Lee, Seiyoung C., Brzoska, Karl F., Duff, Margaret, Palladino, Anthony
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition, Computation and Language
Online Access: https://arxiv.org/abs/2503.17285
author Kim, Louis Y.
Karker, Michelle
Valledor, Victoria
Lee, Seiyoung C.
Brzoska, Karl F.
Duff, Margaret
Palladino, Anthony
contents Recent advances in open-vocabulary object detection models will enable Automatic Target Recognition systems to be sustainable and repurposed by non-technical end-users for a variety of applications or missions. New, and potentially nuanced, classes can be defined with natural language text descriptions in the field, immediately before runtime, without needing to retrain the model. We present an approach for improving non-technical users' natural language text descriptions of their desired targets of interest, using a combination of analysis techniques on the text embeddings and proper combinations of embeddings for contrastive examples. We quantify the improvement that our feedback mechanism provides by demonstrating performance with multiple publicly available open-vocabulary object detection models.
format Preprint
id arxiv_https___arxiv_org_abs_2503_17285
institution arXiv
publishDate 2025
record_format arxiv
title An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection
topic Computer Vision and Pattern Recognition
Computation and Language
url https://arxiv.org/abs/2503.17285