Staff View: Towards optimal hierarchical training of neural networks :: Library Catalog

Bibliographic Details
Main Authors:	Feischl, Michael; Rieder, Alexander; Zehetgruber, Fabian
Format:	Preprint
Published:	2024
Subjects:	Numerical Analysis
Online Access:	https://arxiv.org/abs/2407.02242

_version_	1866910678137176064
author	Feischl, Michael; Rieder, Alexander; Zehetgruber, Fabian
author_facet	Feischl, Michael; Rieder, Alexander; Zehetgruber, Fabian
contents	We propose a hierarchical training algorithm for standard feed-forward neural networks that adaptively extends the network architecture as soon as the optimization reaches a stationary point. By solving small (low-dimensional) optimization problems, the extended network provably escapes any local minimum or stationary point. Under some assumptions on the approximability of the data with stable neural networks, we show that the algorithm achieves an optimal convergence rate s, in the sense that the loss is bounded by the number of parameters raised to the power -s. As a byproduct, we obtain computable indicators which judge the optimality of the training state of a given network and derive a new notion of generalization error.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_02242
institution	arXiv
publishDate	2024
record_format	arxiv
title	Towards optimal hierarchical training of neural networks
topic	Numerical Analysis
url	https://arxiv.org/abs/2407.02242
