Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Dalm, Sander, Offergeld, Joshua, Ahmad, Nasir, van Gerven, Marcel
Format: Preprint
Veröffentlicht: 2024
Schlagworte:
Online-Zugang:https://arxiv.org/abs/2405.02385
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Inhaltsangabe:
  • The backpropagation algorithm remains the dominant and most successful method for training deep neural networks (DNNs). At the same time, training DNNs at scale comes at a significant computational cost and therefore a high carbon footprint. Converging evidence suggests that input decorrelation may speed up deep learning. However, to date, this has not yet translated into substantial improvements in training efficiency in large-scale DNNs. This is mainly caused by the challenge of enforcing fast and stable network-wide decorrelation. Here, we show for the first time that much more efficient training of deep convolutional neural networks is feasible by embracing decorrelated backpropagation as a mechanism for learning. To achieve this goal we made use of a novel algorithm which induces network-wide input decorrelation using minimal computational overhead. By combining this algorithm with careful optimizations, we achieve a more than two-fold speed-up and higher test accuracy compared to backpropagation when training several deep residual networks. This demonstrates that decorrelation provides exciting prospects for efficient deep learning at scale.