Staff View: Nonparametric regression using over-parameterized shallow ReLU neural networks :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Yunfei; Zhou, Ding-Xuan
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Statistics Theory
Online Access:	https://arxiv.org/abs/2306.08321

_version_	1866914822309806080
author	Yang, Yunfei; Zhou, Ding-Xuan
author_facet	Yang, Yunfei; Zhou, Ding-Xuan
contents	It is shown that over-parameterized neural networks can achieve minimax optimal rates of convergence (up to logarithmic factors) for learning functions from certain smooth function classes, if the weights are suitably constrained or regularized. Specifically, we consider the nonparametric regression of estimating an unknown $d$-variate function by using shallow ReLU neural networks. It is assumed that the regression function is from the Hölder space with smoothness $\alpha<(d+3)/2$ or a variation space corresponding to shallow neural networks, which can be viewed as an infinitely wide neural network. In this setting, we prove that least squares estimators based on shallow neural networks with certain norm constraints on the weights are minimax optimal, if the network width is sufficiently large. As a byproduct, we derive a new size-independent bound for the local Rademacher complexity of shallow ReLU neural networks, which may be of independent interest.
format	Preprint
id	arxiv_https___arxiv_org_abs_2306_08321
institution	arXiv
publishDate	2023
record_format	arxiv
title	Nonparametric regression using over-parameterized shallow ReLU neural networks
topic	Machine Learning; Statistics Theory
url	https://arxiv.org/abs/2306.08321

Similar Items