Staff View: CompressNAS : A Fast and Efficient Technique for Model Compression using Decomposition :: Library Catalog


Bibliographic Details
Main Authors:	Sah, Sudhakar; Chabbra, Nikhil; Durnerin, Matthieu
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.11716

_version_	1866918202386153472
author	Sah, Sudhakar; Chabbra, Nikhil; Durnerin, Matthieu
author_facet	Sah, Sudhakar; Chabbra, Nikhil; Durnerin, Matthieu
contents	Deep Convolutional Neural Networks (CNNs) are increasingly difficult to deploy on microcontrollers (MCUs) and lightweight Neural Processing Units (NPUs) due to their growing size and compute demands. Low-rank tensor decomposition, such as Tucker factorization, is a promising way to reduce parameters and operations with reasonable accuracy loss. However, existing approaches select ranks locally and often ignore global trade-offs between compression and accuracy. We introduce CompressNAS, a MicroNAS-inspired framework that treats rank selection as a global search problem. CompressNAS employs a fast accuracy estimator to evaluate candidate decompositions, enabling efficient yet exhaustive rank exploration under memory and accuracy constraints. On ImageNet, CompressNAS compresses ResNet-18 by 8x with an accuracy drop of less than 4%; on COCO, we achieve 2x compression of YOLOv5s without any accuracy drop and 2x compression of YOLOv5n with a 2.5% drop. Finally, we present a new family of compressed models, STResNet, with competitive performance compared to other efficient models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_11716
institution	arXiv
publishDate	2025
record_format	arxiv
title	CompressNAS : A Fast and Efficient Technique for Model Compression using Decomposition
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2511.11716
