Inhaltsangabe: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Li, Yuxuan, Guo, Yunhui
Format:	Preprint
Veröffentlicht:	2024
Schlagworte:	Machine Learning
Online-Zugang:	https://arxiv.org/abs/2411.19819
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Inhaltsangabe:

Architecture plays an important role in deciding the performance of deep neural networks. However, the search for the optimal architecture is often hindered by the vast search space, making it a time-intensive process. Recently, a novel approach known as training-free neural architecture search (NAS) has emerged, aiming to discover the ideal architecture without necessitating extensive training. Training-free NAS leverages various indicators for architecture selection, including metrics such as the count of linear regions, the density of per-sample losses, and the stability of the finite-width Neural Tangent Kernel (NTK) matrix. Despite the competitive empirical performance of current training-free NAS techniques, they suffer from certain limitations, including inconsistent performance and a lack of deep understanding. In this paper, we introduce GradAlign, a simple yet effective method designed for inferring model performance without the need for training. At its core, GradAlign quantifies the extent of conflicts within per-sample gradients during initialization, as substantial conflicts hinder model convergence and ultimately result in worse performance. We evaluate GradAlign against established training-free NAS methods using standard NAS benchmarks, showing a better overall performance. Moreover, we show that the widely adopted metric of linear region count may not suffice as a dependable criterion for selecting network architectures during at initialization.

Ähnliche Einträge