Bibliographic Details
Main Authors: Soen, Alexander, Sun, Ke
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2402.05379
author Soen, Alexander
Sun, Ke
contents The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks. It underpins theories and tools for understanding and optimizing neural networks. Because it is expensive to compute, practitioners often use random estimators and evaluate only the diagonal entries. We examine two popular estimators whose accuracy and sample complexity depend on their associated variances. We derive bounds on the variances and instantiate them in neural networks for regression and classification. We navigate trade-offs between the two estimators based on analytical and numerical studies. We find that the variance quantities depend on the non-linearity with respect to different parameter groups and should not be neglected when estimating the Fisher information.
format Preprint
id arxiv_https___arxiv_org_abs_2402_05379
institution arXiv
publishDate 2024
record_format arxiv
title Trade-Offs of Diagonal Fisher Information Matrix Estimators
topic Machine Learning
url https://arxiv.org/abs/2402.05379