Bibliographic Details
Main Authors: Pallero, Eduardo Lavin; Ruiz-Garcia, Miguel
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2410.10690
Abstract:
  • Dynamical loss functions are derived from the standard loss functions used in supervised classification tasks, modified so that the contribution of each class periodically increases and decreases. These oscillations alter the loss landscape globally without affecting the global minima. In this paper, we demonstrate how to transform cross-entropy and mean squared error into dynamical loss functions. We begin by discussing how increasing the size of the neural network or the learning rate affects the depth and sharpness of the minima that the system explores. Building on this intuition, we propose several versions of dynamical loss functions and show, on a simple classification problem, that they significantly improve validation accuracy for networks of varying sizes. Finally, we explore how the landscape of these dynamical loss functions evolves during training, highlighting the emergence of instabilities that may be linked to edge-of-instability minimization.
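The idea of a class contribution that periodically increases and decreases can be illustrated with a minimal sketch. This record does not give the paper's actual formulas, so everything specific here is an assumption made for illustration: the cosine schedule, the normalization that keeps the weights summing to the number of classes, and the hypothetical names `class_weights`, `dynamical_cross_entropy`, `period`, and `amplitude`. Because every weight stays strictly positive, a model that classifies all classes perfectly still attains the lowest loss, which is consistent with the claim that the global minima are unaffected.

```python
import numpy as np


def class_weights(step, num_classes, period, amplitude):
    """Hypothetical oscillating class weights (not taken from the paper).

    Each class is emphasized in turn: class k peaks at step k * period / num_classes.
    Raw weights lie in [1, amplitude]; they are rescaled to sum to num_classes.
    """
    offsets = np.arange(num_classes) / num_classes
    phase = 2.0 * np.pi * (step / period - offsets)
    raw = 1.0 + (amplitude - 1.0) * 0.5 * (1.0 + np.cos(phase))
    return num_classes * raw / raw.sum()


def dynamical_cross_entropy(logits, labels, step, period=100, amplitude=5.0):
    """Cross-entropy in which each sample is scaled by its class's current weight."""
    num_classes = logits.shape[1]
    weights = class_weights(step, num_classes, period, amplitude)
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    return float(np.mean(weights[labels] * per_sample))


if __name__ == "__main__":
    logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
    labels = np.array([0, 2])
    for step in (0, 33, 66):
        print(step, class_weights(step, 3, 100, 5.0),
              dynamical_cross_entropy(logits, labels, step))
```

Under this assumed schedule, setting `amplitude=1.0` makes every weight equal to 1, so the function reduces to ordinary cross-entropy. That gives a convenient baseline for comparing static and dynamical training runs.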