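Staff View: On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains :: Library Catalog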

Bibliographic Details
Main Authors:	Li, Yicheng; Yu, Zixiong; Chen, Guhan; Lin, Qian
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2305.02657

_version_	1866914449286234112
author	Li, Yicheng; Yu, Zixiong; Chen, Guhan; Lin, Qian
author_facet	Li, Yicheng; Yu, Zixiong; Chen, Guhan; Lin, Qian
contents	In this paper, we provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain rather than $\mathbb S^{d}$. This class of kernel functions includes, but is not limited to, the neural tangent kernel (NTK) associated with neural networks of different depths and various activation functions. After proving that the training dynamics of wide neural networks uniformly approximate those of neural tangent kernel regression on general domains, we further illustrate the minimax optimality of the wide neural network, provided that the underlying true function satisfies $f\in [\mathcal H_{\mathrm{NTK}}]^{s}$, an interpolation space associated with the RKHS $\mathcal{H}_{\mathrm{NTK}}$ of the NTK. We also show that the overfitted neural network cannot generalize well. We believe our approach for determining the EDR of kernels may also be of independent interest.
format	Preprint
id	arxiv_https___arxiv_org_abs_2305_02657
institution	arXiv
publishDate	2023
record_format	arxiv
title	On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
topic	Machine Learning
url	https://arxiv.org/abs/2305.02657

Similar Items