Bibliographic Details
Main Authors: Arar, El-Mehdi El, Filip, Silviu-Ioan, Mary, Theo, Riccietti, Elisa
Format: Preprint
Published: 2025
Subjects: Machine Learning, Artificial Intelligence, Numerical Analysis
Online Access: https://arxiv.org/abs/2503.15568
_version_ 1866917116731457536
author Arar, El-Mehdi El
Filip, Silviu-Ioan
Mary, Theo
Riccietti, Elisa
contents This work proposes a mathematically grounded mixed precision accumulation strategy for the inference of neural networks. Our strategy is based on a new componentwise forward error analysis that explains the propagation of errors in the forward pass of neural networks. Specifically, our analysis shows that the error in each component of the output of a linear layer is proportional to the condition number of the inner product between the weights and the input, multiplied by the condition number of the activation function. These condition numbers can vary widely from one component to another, thus creating a significant opportunity to introduce mixed precision: each component should be accumulated in a precision inversely proportional to the product of these condition numbers. We propose a numerical algorithm that exploits this observation: it first computes all components in low precision, uses this output to estimate the condition numbers, and recomputes in higher precision only the components associated with large condition numbers. We test our algorithm on various networks and datasets and confirm experimentally that it can significantly improve the cost-accuracy tradeoff compared with uniform precision accumulation baselines.
format Preprint
id arxiv_https___arxiv_org_abs_2503_15568
institution arXiv
publishDate 2025
record_format arxiv
title Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
topic Machine Learning
Artificial Intelligence
Numerical Analysis
url https://arxiv.org/abs/2503.15568
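
The three-step algorithm described in the abstract (compute everything in low precision, estimate the condition numbers, recompute only the ill-conditioned components in higher precision) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `mixed_precision_layer`, the use of float16 and float32 as the "low" and "high" precisions, the tanh activation, and the `threshold` value are all assumptions chosen for illustration. The product of the two condition numbers is written here as (|w|^T |x|) * |phi'(z)| / |phi(z)|, which follows from multiplying the usual inner-product condition number |w|^T |x| / |z| by the activation condition number |z phi'(z) / phi(z)|.

```python
import numpy as np

def mixed_precision_layer(W, x, threshold=10.0):
    """Hypothetical sketch of one tanh layer with selective recomputation.

    Returns the layer output and a boolean mask of recomputed components.
    The threshold and the float16/float32 choice are illustrative assumptions.
    """
    W16, x16 = W.astype(np.float16), x.astype(np.float16)

    # Step 1: compute every pre-activation in low precision (float16 stand-in).
    z = (W16 @ x16).astype(np.float64)

    # Step 2: estimate the product of condition numbers from the cheap output.
    # kappa_ip * kappa_phi = (|w|^T |x| / |z|) * |z * phi'(z) / phi(z)|
    #                      = |w|^T |x| * |phi'(z)| / |phi(z)|
    abs_dot = (np.abs(W16) @ np.abs(x16)).astype(np.float64)
    phi = np.tanh(z)
    dphi = 1.0 - phi ** 2
    kappa = abs_dot * np.abs(dphi) / np.maximum(np.abs(phi), 1e-30)

    # Step 3: recompute only the ill-conditioned components in higher precision.
    mask = kappa > threshold
    z[mask] = W[mask].astype(np.float32) @ x.astype(np.float32)

    return np.tanh(z), mask

# Row 0 is a benign sum; row 1 suffers heavy cancellation (1000 - 999.5).
W = np.array([[1.0, 1.0, 1.0, 1.0],
              [1000.0, -999.5, 1.0, -1.0]])
x = np.ones(4)
y, recomputed = mixed_precision_layer(W, x)
```

In this toy input only the second row involves cancellation between large terms, so it is the one a condition-number test of this kind would be expected to flag; in practice the threshold would have to be tuned against the target accuracy.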