Bibliographic Details
Main Author:	Masalskikh, Aleksandr
Format:	Digital resource
Language:	English
Published:	Zenodo 2026
Subjects:	Machine Learning; Neural Networks, Computer; activation functions; Artificial Intelligence
Online Access:	https://doi.org/10.5281/zenodo.19232218

Table of Contents:

We introduce Steklov activations, a piecewise-polynomial activation family derived from B-spline antiderivatives. Parameterized by order r (smoothness) and scale α (transition width), they produce exact zero output and gradient outside a compact support. At α=2 the activation approximates GELU (sup error <0.0091); at α=6 it is exactly HardSwish. On image classification (MNIST, CIFAR-10, CIFAR-100 across LeNet-5, ResNet-18, and WideResNet-28-10), Steklov achieves the highest accuracy on all benchmarks. On language modeling (GPT-2 124M/354M, LLaMA-style 105M), it matches GELU and improves over SiLU. The compact support induces tunable neuron inactivity (3–83%) that is stable across data splits and distributions. Pruning inactive neurons removes 7–11% of parameters with negligible quality loss; a Triton kernel then delivers 3–6% faster inference than unpruned GELU.
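
The α=6 HardSwish correspondence mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the activation's formula, so everything here is an assumption for illustration only: the activation is taken to be the input multiplied by a gate, and the gate is taken to be the antiderivative of a unit-area box kernel (the simplest B-spline) of width α centered at zero. The names box_gate and gated_box_activation are hypothetical and do not come from the record.

```python
def box_gate(x: float, alpha: float) -> float:
    """Antiderivative of a unit-area box kernel of width alpha centered at 0.

    ASSUMPTION: this is one plausible lowest-order gate; the record does not
    state the actual definition. It rises linearly from 0 at x = -alpha/2
    to 1 at x = +alpha/2 and is constant outside that interval.
    """
    return min(1.0, max(0.0, x / alpha + 0.5))


def gated_box_activation(x: float, alpha: float) -> float:
    """Hypothetical gated form x * gate(x); ASSUMED, not taken from the record."""
    return x * box_gate(x, alpha)


def hardswish(x: float) -> float:
    """Standard HardSwish: x * relu6(x + 3) / 6."""
    return x * min(6.0, max(0.0, x + 3.0)) / 6.0


if __name__ == "__main__":
    # Compare the two functions on a grid covering the transition region.
    grid = [i / 10.0 for i in range(-60, 61)]
    worst = max(abs(gated_box_activation(x, 6.0) - hardswish(x)) for x in grid)
    print("largest difference on grid:", worst)
```

Under these assumptions the gate with α=6 is algebraically the same ramp as relu6(x+3)/6, so the two functions agree up to floating-point rounding, and the output is zero for every x at or below -α/2. The GELU approximation at α=2 presumably relies on a higher order r, which this sketch does not attempt to reproduce.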

Similar Items