Staff View: An Application-Agnostic Automatic Target Recognition System Using Vision Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Palladino, Anthony; Gajewski, Dana; Aronica, Abigail; Deptula, Patryk; Hamme, Alexander; Lee, Seiyoung C.; Muri, Jeff; Nelling, Todd; Riley, Michael A.; Wong, Brian; Duff, Margaret
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.03491

_version_	1866912106864967680
author	Palladino, Anthony; Gajewski, Dana; Aronica, Abigail; Deptula, Patryk; Hamme, Alexander; Lee, Seiyoung C.; Muri, Jeff; Nelling, Todd; Riley, Michael A.; Wong, Brian; Duff, Margaret
author_facet	Palladino, Anthony; Gajewski, Dana; Aronica, Abigail; Deptula, Patryk; Hamme, Alexander; Lee, Seiyoung C.; Muri, Jeff; Nelling, Todd; Riley, Michael A.; Wong, Brian; Duff, Margaret
contents	We present a novel Automatic Target Recognition (ATR) system using open-vocabulary object detection and classification models. A primary advantage of this approach is that target classes can be defined just before runtime by a non-technical end user, using either a few natural language text descriptions of the target, or a few image exemplars, or both. Nuances in the desired targets can be expressed in natural language, which is useful for unique targets with little or no training data. We also implemented a novel combination of several techniques to improve performance, such as leveraging the additional information in the sequence of overlapping frames to perform tubelet identification (i.e., sequential bounding box matching), bounding box re-scoring, and tubelet linking. Additionally, we developed a technique to visualize the aggregate output of many overlapping frames as a mosaic of the area scanned during the aerial surveillance or reconnaissance, and a kernel density estimate (or heatmap) of the detected targets. We initially applied this ATR system to the use case of detecting and clearing unexploded ordnance on airfield runways, and we are currently extending our research to other real-world applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_03491
institution	arXiv
publishDate	2024
record_format	arxiv
title	An Application-Agnostic Automatic Target Recognition System Using Vision Language Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2411.03491
