Bibliographic Details
Main Author: Ayena, Koffi O.
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2512.12132
Summary:
  • We present SiLU network constructions whose approximation efficiency depends critically on proper hyperparameter tuning. For the square function $x^2$, with optimally chosen shift $a$ and scale $\beta$, we achieve approximation error $\varepsilon$ using a two-layer network of constant width, where weights scale as $\beta^{\pm k}$ with $k = \mathcal{O}(\ln(1/\varepsilon))$. We then extend this approach through functional composition to Sobolev spaces, obtaining networks with depth $\mathcal{O}(1)$ and $\mathcal{O}(\varepsilon^{-d/n})$ parameters under optimal hyperparameter settings. Our work highlights the trade-off between architectural depth and activation parameter optimization in neural network approximation theory.
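
To make the role of the scale $\beta$ concrete, here is a minimal Python sketch of a two-unit SiLU approximation of $x^2$. It is not the construction from the preprint: it omits the shift $a$ and uses only the generic identity $\mathrm{SiLU}(z) + \mathrm{SiLU}(-z) = z\tanh(z/2)$, which gives $2\beta^2\,[\mathrm{SiLU}(x/\beta) + \mathrm{SiLU}(-x/\beta)] = x^2 - x^4/(12\beta^2) + \dots$ for large $\beta$. The function names `square_via_silu` and `max_error` are hypothetical, chosen only for illustration.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def square_via_silu(x: float, beta: float) -> float:
    """Approximate x**2 with two SiLU units (illustrative assumption,
    not the preprint's construction; no shift parameter is used).

    Inner weights are +/- 1/beta, the outer weight is 2 * beta**2, so
    every weight is a power of beta.
    """
    z = x / beta
    return 2.0 * beta * beta * (silu(z) + silu(-z))


def max_error(beta: float, n: int = 201) -> float:
    """Largest absolute error against x**2 on an n-point grid over [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return max(abs(square_via_silu(x, beta) - x * x) for x in xs)
```

In this simplified version the error on $[-1, 1]$ is at most $1/(12\beta^2)$, so it shrinks only polynomially as $\beta$ grows. The abstract's $\beta^{\pm k}$ weight scaling with $k = \mathcal{O}(\ln(1/\varepsilon))$ refers to the preprint's own construction, which this sketch does not reproduce.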