Staff View: Correcting Auto-Differentiation in Neural-ODE Training :: Library Catalog

Bibliographic Details
Main Authors:	Xu, Yewei; Chen, Shi; Li, Qin
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Numerical Analysis; 65D25 (Primary), 65L06, 90C31 (Secondary)
Online Access:	https://arxiv.org/abs/2306.02192

_version_	1866911549081255936
author	Xu, Yewei; Chen, Shi; Li, Qin
author_facet	Xu, Yewei; Chen, Shi; Li, Qin
contents	Does the use of auto-differentiation yield reasonable updates for deep neural networks (DNNs)? Specifically, when DNNs are designed to adhere to neural ODE architectures, can we trust the gradients provided by auto-differentiation? Through mathematical analysis and numerical evidence, we demonstrate that when neural networks employ high-order methods, such as Linear Multistep Methods (LMM) or Explicit Runge-Kutta Methods (ERK), to approximate the underlying ODE flows, brute-force auto-differentiation often introduces artificial oscillations in the gradients that prevent convergence. In the case of Leapfrog and 2-stage ERK, we propose simple post-processing techniques that effectively eliminate these oscillations, correct the gradient computation, and thus return accurate updates.
format	Preprint
id	arxiv_https___arxiv_org_abs_2306_02192
institution	arXiv
publishDate	2023
record_format	arxiv
title	Correcting Auto-Differentiation in Neural-ODE Training
topic	Machine Learning; Numerical Analysis; 65D25 (Primary), 65L06, 90C31 (Secondary)
url	https://arxiv.org/abs/2306.02192

Similar Items