| Main author: | |
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2025 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.17996325 |
Summary:
- <p>Abstract</p> <p>Natural Gradient descent is a second-order optimization method that incorporates the Fisher information matrix to account for the geometry of probability distributions. While its theoretical foundation in information geometry is well established, empirical validation of its key properties, particularly coordinate invariance, has been limited. We present three complementary experiments that validate Natural Gradient's fundamental characteristics in softmax regression: (1) local efficiency advantage under matched Kullback-Leibler (KL) divergence, (2) performance under strict KL-controlled learning, and (3) coordinate invariance under reparameterization. Our results demonstrate that Natural Gradient achieves a 1.98% greater loss reduction per step than SGD, maintains performance parity with SGD under exact KL budgeting in linear models, and exhibits machine-precision coordinate invariance, with consistency across parameter rescalings 1.9 billion times better than that of SGD. We introduce a KL-controlled learning protocol that enables fair comparison of optimization methods, and we validate it through extensive experimentation. These findings provide the first systematic empirical confirmation of Natural Gradient's coordinate invariance property in a controlled setting and establish benchmarks for evaluating geometric optimization methods.</p>
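The coordinate-invariance claim in the abstract can be illustrated with a small sketch. This is not the authors' experimental code: the synthetic data, the function names (`softmax_probs`, `grad_and_fisher`, `step_in_scaled_coords`), the step size, and the choice to pin the last class's logit to zero (so that the Fisher matrix is invertible) are all assumptions made for illustration. The sketch takes one update in rescaled coordinates `phi = theta / scale`, maps the result back to `theta`, and compares it with the update taken in the original coordinates.

```python
import numpy as np

def softmax_probs(W, X):
    """Class probabilities; the last class's logit is pinned to 0 (assumption)."""
    logits = np.hstack([X @ W, np.zeros((X.shape[0], 1))])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def grad_and_fisher(W, X, Y):
    """Gradient of the mean cross-entropy and the Fisher matrix, both for W.ravel()."""
    n, d = X.shape
    k1 = W.shape[1]  # number of free classes (K - 1)
    P = softmax_probs(W, X)
    g = (X.T @ (P[:, :k1] - Y[:, :k1]) / n).ravel()
    F = np.zeros((d * k1, d * k1))
    for x, p in zip(X, P[:, :k1]):
        S = np.diag(p) - np.outer(p, p)      # Fisher of the free logits
        F += np.kron(np.outer(x, x), S)      # pulled back to row-major W
    return g, F / n

def step_in_scaled_coords(W, X, Y, scale, lr, natural):
    """One step in coordinates phi = theta / scale, mapped back to theta."""
    g, F = grad_and_fisher(W, X, Y)
    g_phi = scale * g                                  # chain rule
    if natural:
        F_phi = scale[:, None] * F * scale[None, :]    # D F D
        delta_phi = np.linalg.solve(F_phi, g_phi)
    else:
        delta_phi = g_phi                              # plain gradient step
    return (W.ravel() - lr * scale * delta_phi).reshape(W.shape)

# Hypothetical synthetic problem: 200 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
n, d, K = 200, 4, 3
X = rng.normal(size=(n, d))
Y = np.eye(K)[rng.integers(0, K, size=n)]
W = 0.1 * rng.normal(size=(d, K - 1))

identity = np.ones(d * (K - 1))
scale = rng.uniform(0.5, 2.0, size=d * (K - 1))

# Largest parameter discrepancy between the two coordinate systems.
ng_gap = np.abs(step_in_scaled_coords(W, X, Y, scale, 0.5, True)
                - step_in_scaled_coords(W, X, Y, identity, 0.5, True)).max()
sgd_gap = np.abs(step_in_scaled_coords(W, X, Y, scale, 0.5, False)
                 - step_in_scaled_coords(W, X, Y, identity, 0.5, False)).max()
print(f"natural gradient gap: {ng_gap:.3e}, plain gradient gap: {sgd_gap:.3e}")
```

In this sketch the natural-gradient update should agree across the two coordinate systems up to floating-point round-off, because the rescaling cancels between the transformed gradient and the transformed Fisher matrix, whereas the plain gradient update picks up a factor of `scale**2`. The magnitudes printed here depend on the toy setup and are not comparable to the figures reported in the abstract.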