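| Main Authors: | Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony |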
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition; Computation and Language |
| Online Access: | https://arxiv.org/abs/2503.17285 |
| _version_ | 1866912286325604352 |
|---|---|
| author | Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony |
| author_facet | Kim, Louis Y.; Karker, Michelle; Valledor, Victoria; Lee, Seiyoung C.; Brzoska, Karl F.; Duff, Margaret; Palladino, Anthony |
| contents | Recent advances in open-vocabulary object detection models will enable Automatic Target Recognition systems to be sustainable and repurposed by non-technical end-users for a variety of applications or missions. New, and potentially nuanced, classes can be defined with natural language text descriptions in the field, immediately before runtime, without needing to retrain the model. We present an approach for improving non-technical users' natural language text descriptions of their desired targets of interest, using a combination of analysis techniques on the text embeddings, and proper combinations of embeddings for contrastive examples. We quantify the improvement that our feedback mechanism provides by demonstrating performance with multiple publicly-available open-vocabulary object detection models. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2503_17285 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection |
| topic | Computer Vision and Pattern Recognition; Computation and Language |
| url | https://arxiv.org/abs/2503.17285 |