Bibliographic Details
Main Authors: Abeykoon, Chathurika S; Beknazaryan, Aleksandr; Sang, Hailin
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2504.19351
author Abeykoon, Chathurika S
Beknazaryan, Aleksandr
Sang, Hailin
author_facet Abeykoon, Chathurika S
Beknazaryan, Aleksandr
Sang, Hailin
contents Recent studies have observed a surprising behavior of model test error called the double descent phenomenon: as model complexity increases, the test error first decreases, then increases, and then decreases again. To observe this, we work with a two-layer neural network model with a ReLU activation function, designed for binary classification under supervised learning. Our aim is to observe and investigate the mathematical theory behind the double descent behavior of the model test error for varying model sizes. We quantify the model size by the ratio of the number of training samples to the dimension of the model. Due to the complexity of the empirical risk minimization procedure, we use the Convex Gaussian Min-Max Theorem to find a suitable candidate for the global training loss.
format Preprint
id arxiv_https___arxiv_org_abs_2504_19351
institution arXiv
publishDate 2025
record_format arxiv
title The Double Descent Behavior in Two Layer Neural Network for Binary Classification
topic Machine Learning
url https://arxiv.org/abs/2504.19351