Bibliographic Details
Main Authors: Jadhav, Atharva, Karekar, Arush, Divekar, Manas, Natu, Shachi
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Machine Learning
Online Access: https://arxiv.org/abs/2509.23697
_version_ 1866909813011644416
author Jadhav, Atharva
Karekar, Arush
Divekar, Manas
Natu, Shachi
contents The safety and security of public spaces are of vital importance, driving the need for sophisticated surveillance systems capable of accurately detecting weapons; such systems are often hampered by partial occlusion, varying lighting, and cluttered backgrounds. While single-model detectors are advanced, they often lack robustness in these challenging conditions. This paper presents the hypothesis that an ensemble of Single Shot Multibox Detector (SSD) models with diverse feature extraction backbones can significantly enhance detection robustness. To leverage diverse feature representations, individual SSD models were trained using a selection of backbone networks: VGG16, ResNet50, EfficientNet, and MobileNetV3. The study is conducted on a dataset of images covering three distinct weapon classes: guns, heavy weapons, and knives. The predictions from these models are combined using the Weighted Boxes Fusion (WBF) method, an ensemble technique designed to optimize bounding box accuracy. Our key finding is that the fusion strategy is as critical as the ensemble's diversity: a WBF approach using a 'max' confidence scoring strategy achieved a mean Average Precision (mAP) of 0.838. This represents a 2.948% relative improvement over the best-performing single model, and the approach consistently outperforms other fusion heuristics. This research offers a robust approach to enhancing real-time weapon detection capabilities in surveillance applications by demonstrating that confidence-aware fusion is a key mechanism for improving the accuracy metrics of ensembles.
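The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, pure-Python rendering of the general Weighted Boxes Fusion idea, assuming equal model weights, boxes given as normalized (x1, y1, x2, y2) tuples, and no rescaling of scores by the number of contributing models. The names `iou`, `fuse_cluster`, `fuse_detections`, and the default `iou_thr=0.55` are hypothetical choices made for illustration. With `conf_type="max"` a fused box keeps the highest confidence in its cluster; with `conf_type="avg"` it takes the mean.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), normalized
Detection = Tuple[Box, float, str]        # (box, confidence, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_cluster(members: List[Tuple[Box, float]], conf_type: str) -> Tuple[Box, float]:
    """Confidence-weighted average of coordinates; 'max' or 'avg' score."""
    total = sum(score for _, score in members)
    box = tuple(sum(b[i] * s for b, s in members) / total for i in range(4))
    if conf_type == "max":
        score = max(s for _, s in members)
    else:  # "avg" (assumed fallback for this sketch)
        score = total / len(members)
    return box, score


def fuse_detections(detections: List[Detection],
                    iou_thr: float = 0.55,
                    conf_type: str = "max") -> List[Detection]:
    """Fuse detections pooled from all ensemble members for one image."""
    clusters: List[Dict] = []
    # Process the most confident boxes first, as in the usual WBF ordering.
    for box, score, label in sorted(detections, key=lambda d: d[1], reverse=True):
        best, best_iou = None, iou_thr
        for cluster in clusters:
            if cluster["label"] != label:
                continue
            overlap = iou(cluster["box"], box)
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            # No sufficiently overlapping fused box: start a new cluster.
            clusters.append({"label": label, "members": [(box, score)],
                             "box": box, "score": score})
        else:
            # Add to the matched cluster and recompute its fused box.
            best["members"].append((box, score))
            best["box"], best["score"] = fuse_cluster(best["members"], conf_type)
    return [(c["box"], c["score"], c["label"]) for c in clusters]


# Hypothetical detections from two models on one image.
pooled = [
    ((0.10, 0.10, 0.50, 0.50), 0.9, "gun"),
    ((0.12, 0.12, 0.52, 0.52), 0.6, "gun"),
    ((0.60, 0.60, 0.90, 0.90), 0.7, "knife"),
]
fused_max = fuse_detections(pooled, conf_type="max")
fused_avg = fuse_detections(pooled, conf_type="avg")
```

In this toy input the two overlapping gun boxes collapse into one fused box whose coordinates lean toward the more confident prediction, while the knife box passes through unchanged. Only the score rule differs between the two calls, which is the kind of choice the abstract reports as mattering for mAP.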
format Preprint
id arxiv_https___arxiv_org_abs_2509_23697
institution arXiv
publishDate 2025
record_format arxiv
title Confidence Aware SSD Ensemble with Weighted Boxes Fusion for Weapon Detection
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2509.23697