Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Abeykoon, Chathurika S; Beknazaryan, Aleksandr; Sang, Hailin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2504.19351

_version_	1866913833631612928
author	Abeykoon, Chathurika S; Beknazaryan, Aleksandr; Sang, Hailin
author_facet	Abeykoon, Chathurika S; Beknazaryan, Aleksandr; Sang, Hailin
contents	Recent studies have observed a surprising behavior of model test error called the double descent phenomenon, in which increasing model complexity first decreases the test error, after which the error increases and then decreases again. To study this, we consider a two-layer neural network with a ReLU activation function designed for binary classification under supervised learning. Our aim is to observe and investigate the mathematical theory behind the double descent behavior of the model test error for varying model sizes. We quantify the model size by the ratio of the number of training samples to the dimension of the model. Due to the complexity of the empirical risk minimization procedure, we use the Convex Gaussian Min-Max Theorem to find a suitable candidate for the global training loss.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_19351
institution	arXiv
publishDate	2025
record_format	arxiv
title	The Double Descent Behavior in Two Layer Neural Network for Binary Classification
topic	Machine Learning
url	https://arxiv.org/abs/2504.19351
