Bibliographic Details
Main Authors: Lee, Chankyu, Choi, Woohyun, Park, Sangwook
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2601.01698
Table of Contents:
  • This study evaluates the inference performance of various deep learning models in an embedded system environment. Previous work typically uses the number of Multiply-Accumulate (MAC) operations to measure the computational load of a deep model. This study finds, however, that the metric is limited as an estimator of inference time on embedded devices, and asks what is overlooked when computational cost is expressed in terms of Multiply-Accumulate operations. In the experiments, an image classification task on the CIFAR-100 dataset is run on an embedded device, and the measured inference times of ten deep models are compared with the theoretically calculated Multiply-Accumulate operations for each model. The results highlight the importance of considering additional computations between tensors when optimizing deep learning models for real-time performance on embedded systems.
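
To illustrate the kind of gap the abstract points to, here is a minimal sketch of how theoretical MACs are commonly counted for convolution and fully connected layers, next to the element-wise tensor work that such a count leaves out. The layer shapes and helper names (`conv2d_macs`, `linear_macs`, `elementwise_elements`) are hypothetical and chosen for a 32x32 CIFAR-style input; they are not the models or the counting procedure used in the paper.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """MACs for a 2D convolution: one MAC per kernel weight per output element."""
    return (c_in // groups) * kernel * kernel * c_out * h_out * w_out


def linear_macs(in_features, out_features):
    """MACs for a fully connected layer: one MAC per weight."""
    return in_features * out_features


def elementwise_elements(channels, height, width):
    """Elements touched by a tensor-to-tensor op (e.g. a residual add).

    These operations contain no multiply-accumulate, so a pure MAC count
    assigns them zero cost even though they still take time to execute.
    """
    return channels * height * width


# Hypothetical toy network (assumed shapes, not from the paper):
# conv 3->16, conv 16->16 with a residual add, global pooling, 100-way classifier.
conv1 = conv2d_macs(3, 16, 3, 32, 32)
conv2 = conv2d_macs(16, 16, 3, 32, 32)
classifier = linear_macs(16, 100)
residual_add = elementwise_elements(16, 32, 32)

total_macs = conv1 + conv2 + classifier

print("theoretical MACs:", total_macs)
print("elements in residual add (0 MACs):", residual_add)
```

In this toy count the residual add contributes nothing to the MAC total, yet on a device it still reads and writes every element of the tensors involved. Whether that cost is significant on a given embedded device is something only measurement can show.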