Staff view: Library Catalog

Bibliographic Details
Main authors:	Feng, Chen; Zhuo, Shaojie; Zhang, Xiaopeng; Ramakrishnan, Ramchalam Kinattinkara; Yuan, Zhaocong; Li, Andrew Zou
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online access:	https://arxiv.org/abs/2411.04036

_version_	1866915007723208704
author	Feng, Chen; Zhuo, Shaojie; Zhang, Xiaopeng; Ramakrishnan, Ramchalam Kinattinkara; Yuan, Zhaocong; Li, Andrew Zou
author_facet	Feng, Chen; Zhuo, Shaojie; Zhang, Xiaopeng; Ramakrishnan, Ramchalam Kinattinkara; Yuan, Zhaocong; Li, Andrew Zou
contents	Continuously adapting pre-trained models to local data on resource-constrained edge devices is the "last mile" of model deployment. However, as models increase in size and depth, backpropagation requires a large amount of memory, which becomes prohibitive for edge devices. In addition, most existing low-power neural processing engines (e.g., NPUs, DSPs, MCUs) are designed as fixed-point inference accelerators without training capabilities. Forward gradients, based solely on directional derivatives computed from two forward calls, have recently been used for model training, with substantial savings in computation and memory. However, the performance of quantized training with fixed-point forward gradients remains unclear. In this paper, we investigate the feasibility of on-device training using fixed-point forward gradients by conducting comprehensive experiments across a variety of deep learning benchmark tasks in both the vision and audio domains. We propose a series of algorithm enhancements that further reduce both the memory footprint and the accuracy gap compared to backpropagation. We further present an empirical study of how training with forward gradients navigates the loss landscape. Our results demonstrate that, on the last mile of model customization on edge devices, training with fixed-point forward gradients is a feasible and practical approach.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_04036
institution	arXiv
publishDate	2024
record_format	arxiv
title	Stepping Forward on the Last Mile
topic	Machine Learning
url	https://arxiv.org/abs/2411.04036
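
Note on the abstract: the phrase "directional derivatives computed from two forward calls" can be illustrated with a small sketch. The code below is not taken from the paper. It assumes a Gaussian random direction, a central-difference estimate, plain floating-point arithmetic (not the fixed-point setting the paper studies), and a toy linear model. The names `loss`, `forward_gradient`, and `train` are hypothetical and chosen only for illustration.

```python
import random


def loss(w, data):
    """Mean squared error of a toy linear model y = w[0] * x + w[1]."""
    return sum((w[0] * x + w[1] - y) ** 2 for x, y in data) / len(data)


def forward_gradient(f, w, eps=1e-3, rng=random):
    """Estimate the gradient of f at w using only two forward calls.

    A random direction v is drawn, the directional derivative along v is
    estimated by a central difference, and v is scaled by that value.
    No backward pass and no stored activations are needed.
    """
    v = [rng.gauss(0.0, 1.0) for _ in w]
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    directional = (f(w_plus) - f(w_minus)) / (2.0 * eps)  # two forward calls
    return [directional * vi for vi in v]


def train(data, steps=2000, lr=0.01, seed=0):
    """Plain gradient descent driven by the forward-gradient estimate."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(steps):
        g = forward_gradient(lambda p: loss(p, data), w, rng=rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


# Toy data generated from y = 2x + 1 (an assumption for this sketch).
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w_fit = train(data)
```

Each update needs only the two loss evaluations and the direction `v`, which is the source of the memory savings the abstract refers to. The estimate is noisy per step, so a small learning rate and many steps are used here.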