Bibliographic Details
Main Authors: Verma, Tushar, Singh, Jyotsna, Bhartari, Yash, Jarwal, Rishi, Singh, Suraj, Singh, Shubhkarman
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access: https://arxiv.org/abs/2405.01699
author Verma, Tushar
Singh, Jyotsna
Bhartari, Yash
Jarwal, Rishi
Singh, Suraj
Singh, Shubhkarman
contents Small object detection in aerial imagery presents significant challenges in computer vision due to the minimal data inherent in small objects and their propensity to be obscured by larger objects and background noise. Traditional methods using transformer-based models often face limitations stemming from the lack of specialized databases, which adversely affects their performance with objects of varying orientations and scales. This underscores the need for more adaptable, lightweight models. In response, this paper introduces two innovative approaches that significantly enhance detection and segmentation capabilities for small aerial objects. First, we explore the use of the SAHI framework on the newly introduced lightweight YOLO v9 architecture, which utilizes Programmable Gradient Information (PGI) to reduce the substantial information loss typically encountered in sequential feature extraction processes. Second, the paper employs the Vision Mamba model, which incorporates position embeddings to facilitate precise location-aware visual understanding, combined with a novel bidirectional State Space Model (SSM) for effective visual context modeling. This State Space Model adeptly harnesses the linear complexity of CNNs and the global receptive field of Transformers, making it particularly effective in remote sensing image classification. Our experimental results demonstrate substantial improvements in detection accuracy and processing efficiency, validating the applicability of these approaches for real-time small object detection across diverse aerial scenarios. This paper also discusses how these methodologies could serve as foundational models for future advancements in aerial object recognition technologies. The source code will be made accessible here.
format Preprint
id arxiv_https___arxiv_org_abs_2405_01699
institution arXiv
publishDate 2024
record_format arxiv
title SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2405.01699