Staff View: Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay :: Library Catalog

Bibliographic Details
Main Authors:	Pan, Leyan; Cao, Xinyuan
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2309.04644

_version_	1866914936952717312
author	Pan, Leyan; Cao, Xinyuan
author_facet	Pan, Leyan; Cao, Xinyuan
contents	Neural Collapse (NC) is a geometric structure recently observed at the terminal phase of training deep neural networks, which states that last-layer feature vectors for the same class would "collapse" to a single point, while features of different classes become equally separated. We demonstrate that batch normalization (BN) and weight decay (WD) critically influence the emergence of NC. In the near-optimal loss regime, we establish an asymptotic lower bound on the emergence of NC that depends only on the WD value, training loss, and the presence of last-layer BN. Our experiments substantiate theoretical insights by showing that models demonstrate a stronger presence of NC with BN, appropriate WD values, lower loss, and lower last-layer feature norm. Our findings offer a novel perspective in studying the role of BN and WD in shaping neural network features.
format	Preprint
id	arxiv_https___arxiv_org_abs_2309_04644
institution	arXiv
publishDate	2023
record_format	arxiv
title	Towards Understanding Neural Collapse: The Effects of Batch Normalization and Weight Decay
topic	Machine Learning
url	https://arxiv.org/abs/2309.04644
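
The abstract in the contents field describes Neural Collapse as same-class last-layer features collapsing to a single point while different classes become equally separated. The sketch below shows one simple way such a structure could be measured with NumPy. The function name nc_metrics and both metrics (a within-class to between-class scatter ratio, and the spread of pairwise cosine similarities between centered class means) are illustrative assumptions, not the definitions or bounds used in the cataloged paper.

```python
import numpy as np

def nc_metrics(features, labels):
    """Hypothetical helper: return (within/between scatter ratio, cosine spread).

    features: (n_samples, dim) array of last-layer feature vectors.
    labels:   (n_samples,) array of class labels.
    Both returned values approach 0 as the features approach the collapsed,
    equally separated geometry described in the abstract.
    """
    classes, class_index = np.unique(labels, return_inverse=True)
    num_classes = len(classes)

    # One mean vector per class, plus the mean over all samples.
    means = np.stack(
        [features[class_index == k].mean(axis=0) for k in range(num_classes)]
    )
    global_mean = features.mean(axis=0)

    # Within-class scatter: mean squared distance of a feature to its class mean.
    within = np.mean(np.sum((features - means[class_index]) ** 2, axis=1))

    # Between-class scatter: mean squared distance of a class mean to the global mean.
    centered = means - global_mean
    between = np.mean(np.sum(centered ** 2, axis=1))

    # Pairwise cosine similarities between centered class means.
    # Equal separation means all off-diagonal entries share one value.
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    cosines = unit @ unit.T
    off_diagonal = cosines[~np.eye(num_classes, dtype=bool)]

    return within / between, off_diagonal.std()
```

Called on features extracted from a trained network, a scatter ratio near zero would indicate strong within-class collapse, and a cosine spread near zero would indicate that class means are close to equally separated. The sketch assumes at least two classes whose means do not coincide with the global mean.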
