MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Computation and Language
Online Access:	https://arxiv.org/abs/2503.17285

_version_	1866912286325604352
author	Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony
author_facet	Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony
contents	Recent advances in open-vocabulary object detection models will enable Automatic Target Recognition systems to be sustainable and repurposed by non-technical end-users for a variety of applications or missions. New, and potentially nuanced, classes can be defined with natural language text descriptions in the field, immediately before runtime, without needing to retrain the model. We present an approach for improving non-technical users' natural language text descriptions of their desired targets of interest, using a combination of analysis techniques on the text embeddings, and proper combinations of embeddings for contrastive examples. We quantify the improvement that our feedback mechanism provides by demonstrating performance with multiple publicly-available open-vocabulary object detection models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_17285
institution	arXiv
publishDate	2025
record_format	arxiv
title	An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection
topic	Computer Vision and Pattern Recognition; Computation and Language
url	https://arxiv.org/abs/2503.17285
