Bibliographic details
Main author: HIDEKI
Format: Digital resource
Language:
Published: Zenodo 2025
Subjects:
Online access: https://doi.org/10.5281/zenodo.17996325
Summary:
  • Abstract: Natural gradient descent is a second-order optimization method that uses the Fisher information matrix to account for the geometry of probability distributions. Although its theoretical foundation in information geometry is well established, empirical validation of its key properties, coordinate invariance in particular, has been limited. We present three complementary experiments that test the fundamental characteristics of natural gradient in softmax regression: (1) local efficiency under matched Kullback-Leibler (KL) divergence, (2) performance under strict KL-controlled learning, and (3) coordinate invariance under reparameterization. Our results show that natural gradient achieves a 1.98% greater loss reduction per step, matches SGD under exact KL budgeting in linear models, and is coordinate invariant to machine precision, with 1.9 billion-fold better consistency than SGD across parameter rescalings. We introduce a KL-controlled learning protocol that enables fair comparison of optimization methods and validate it through extensive experimentation. These findings provide the first systematic empirical confirmation of the coordinate invariance of natural gradient in a controlled setting and establish benchmarks for evaluating geometric optimization methods.
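The coordinate invariance described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's code: the function names, the reference-class parameterization (last class logit fixed at 0, so the Fisher matrix is invertible), the synthetic data, and the scalar rescaling theta = s * phi are all assumptions made for illustration. The helper `kl_matched_step` uses the second-order approximation KL ≈ ½ δᵀFδ to scale a step to a given KL budget; the record does not specify how the KL-controlled protocol is actually implemented.

```python
import numpy as np

def softmax_probs(theta, X):
    """Class probabilities; the last class is a reference with logit 0 (assumption)."""
    logits = np.hstack([X @ theta, np.zeros((X.shape[0], 1))])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def grad_and_fisher(theta, X, y):
    """Flattened gradient of the mean negative log-likelihood and the Fisher matrix."""
    n, d = X.shape
    k1 = theta.shape[1]                      # number of free classes (K - 1)
    P = softmax_probs(theta, X)
    Y = np.eye(k1 + 1)[y]                    # one-hot labels
    G = X.T @ (P - Y)[:, :k1] / n            # shape (d, k1)
    Pr = P[:, :k1]
    # Per-sample output-space Fisher: diag(p_i) - p_i p_i^T, shape (n, k1, k1)
    S = np.einsum('ij,jk->ijk', Pr, np.eye(k1)) - np.einsum('ij,ik->ijk', Pr, Pr)
    # Mean over samples of (x_i x_i^T) kron S_i, indexed to match theta.reshape(-1)
    F = np.einsum('ia,ib,ijk->ajbk', X, X, S).reshape(d * k1, d * k1) / n
    return G.reshape(-1), F

def step_in_scaled_coords(theta, X, y, s, lr, natural):
    """Take one step in coordinates phi = theta / s, then map back to theta."""
    g, F = grad_and_fisher(theta, X, y)
    g_phi, F_phi = s * g, s**2 * F           # chain rule for theta = s * phi
    direction = np.linalg.solve(F_phi, g_phi) if natural else g_phi
    phi_new = theta.reshape(-1) / s - lr * direction
    return (s * phi_new).reshape(theta.shape)

def kl_matched_step(direction, F, kl_budget):
    """Rescale a direction so that 0.5 * step^T F step equals kl_budget."""
    return np.sqrt(2.0 * kl_budget / (direction @ F @ direction)) * direction

# Synthetic data (hypothetical): 200 samples, 3 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = rng.integers(0, 3, size=200)
theta0 = np.zeros((3, 2))

for name, natural in [("SGD", False), ("natural gradient", True)]:
    a = step_in_scaled_coords(theta0, X, y, 1.0, 0.1, natural)
    b = step_in_scaled_coords(theta0, X, y, 10.0, 0.1, natural)
    print(name, "max difference across rescalings:", np.abs(a - b).max())
```

In this sketch the natural gradient update mapped back to theta is the same for every scale s up to floating-point error, because the factors of s from the gradient and the Fisher matrix cancel, while the SGD update grows with s squared.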