Staff View: Replication Study and Benchmarking of Real-Time Object Detection Models :: Library Catalog

Bibliographic Details
Main Authors:	Asselin, Pierre-Luc; Coulombe, Vincent; Guimont-Martin, William; Larrivée-Hardy, William
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2405.06911

_version_	1866918135367467008
author	Asselin, Pierre-Luc; Coulombe, Vincent; Guimont-Martin, William; Larrivée-Hardy, William
author_facet	Asselin, Pierre-Luc; Coulombe, Vincent; Guimont-Martin, William; Larrivée-Hardy, William
contents	This work examines the reproducibility and benchmarking of state-of-the-art real-time object detection models. Because object detection models are often used in real-world contexts, such as robotics, where inference time is paramount, measuring accuracy alone is not enough to compare them. We thus compare the accuracy and inference speed of a large variety of object detection models on multiple graphics cards. In addition to this large benchmarking effort, we also reproduce the following models from scratch using PyTorch on the MS COCO 2017 dataset: DETR, RTMDet, ViTDet and YOLOv7. More importantly, we propose a unified training and evaluation pipeline, based on MMDetection's features, to better compare models. Our implementations of DETR and ViTDet could not achieve accuracy or speed comparable to what is reported in the original papers. On the other hand, the reproduced RTMDet and YOLOv7 could match the reported performance. The studied papers are also found to generally lack the details needed for reproducibility. As for MMDetection pretrained models, speed is severely reduced with limited computing resources (even more so for larger, more accurate models). Moreover, the results exhibit a strong trade-off between accuracy and speed, with anchor-free models, notably RTMDet and YOLOX, prevailing. The code used in this paper and all the experiments are available in the repository at https://github.com/willGuimont/segdet_mlcr2024.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_06911
institution	arXiv
publishDate	2024
record_format	arxiv
title	Replication Study and Benchmarking of Real-Time Object Detection Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2405.06911
