Bibliographic Details
Main Authors: Hasan, Mahmudul, Hossain, Mabsur Fatin Bin
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2601.01099
author Hasan, Mahmudul
Hossain, Mabsur Fatin Bin
contents This paper presents a comparative study of a custom convolutional neural network (CNN) architecture against widely used pretrained and transfer learning CNN models across five real-world image datasets. The datasets span binary classification, fine-grained multiclass recognition, and object detection scenarios. We analyze how architectural factors, such as network depth, residual connections, and feature extraction strategies, influence classification and localization performance. The results show that deeper CNN architectures provide substantial performance gains on fine-grained multiclass datasets, while lightweight pretrained and transfer learning models remain highly effective for simpler binary classification tasks. Additionally, we extend the proposed architecture to an object detection setting, demonstrating its adaptability in identifying unauthorized auto-rickshaws in real-world traffic scenes. Building upon a systematic analysis of custom CNN architectures alongside pretrained and transfer learning models, this study provides practical guidance for selecting suitable network designs based on task complexity and resource constraints.
format Preprint
id arxiv_https___arxiv_org_abs_2601_01099
institution arXiv
publishDate 2026
record_format arxiv
title Evolving CNN Architectures: From Custom Designs to Deep Residual Models for Diverse Image Classification and Detection Tasks
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2601.01099