Saved in:
Bibliographic Details
Main Authors: Li, Yang, Xiao, Jianli
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2409.01093
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866908037303762944
author Li, Yang
Xiao, Jianli
author_facet Li, Yang
Xiao, Jianli
contents Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios. With the rapid development of deep learning technology, CNN-based YOLO real-time object detectors have gained significant attention. However, the local focus of CNNs results in performance bottlenecks. To further enhance detector performance, researchers have introduced Transformer-based self-attention mechanisms to leverage global receptive fields, but their quadratic complexity incurs substantial computational costs. Recently, Mamba, with its linear complexity, has made significant progress through global selective scanning. Inspired by Mamba's outstanding performance, we propose a novel object detector: DS MYOLO. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Additionally, we introduce an efficient channel attention convolution (ECAConv) that enhances cross-channel feature interaction while maintaining low computational complexity. Extensive experiments on the CCTSDB 2021 and VLD-45 driving scenarios datasets demonstrate that DS MYOLO exhibits significant potential and competitive advantage among similarly scaled YOLO series real-time object detectors.
format Preprint
id arxiv_https___arxiv_org_abs_2409_01093
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios
Li, Yang
Xiao, Jianli
Computer Vision and Pattern Recognition
Artificial Intelligence
Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios. With the rapid development of deep learning technology, CNN-based YOLO real-time object detectors have gained significant attention. However, the local focus of CNNs results in performance bottlenecks. To further enhance detector performance, researchers have introduced Transformer-based self-attention mechanisms to leverage global receptive fields, but their quadratic complexity incurs substantial computational costs. Recently, Mamba, with its linear complexity, has made significant progress through global selective scanning. Inspired by Mamba's outstanding performance, we propose a novel object detector: DS MYOLO. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Additionally, we introduce an efficient channel attention convolution (ECAConv) that enhances cross-channel feature interaction while maintaining low computational complexity. Extensive experiments on the CCTSDB 2021 and VLD-45 driving scenarios datasets demonstrate that DS MYOLO exhibits significant potential and competitive advantage among similarly scaled YOLO series real-time object detectors.
title DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2409.01093