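Staff View: Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions :: Library Catalog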

Bibliographic Details
Main Authors:	Zhang, Wentao; Zhang, Yutong; Zhu, Yifan; Mo, Wentao
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.02591

_version_	1866915978095362048
author	Zhang, Wentao; Zhang, Yutong; Zhu, Yifan; Mo, Wentao
author_facet	Zhang, Wentao; Zhang, Yutong; Zhu, Yifan; Mo, Wentao
contents	The efficacy of deep neural networks is heavily reliant on the design of non-linear activation functions, yet existing approaches often struggle to balance optimization stability with computational efficiency. While piecewise linear functions offer inference speed, they suffer from optimization instability due to non-differentiability at the origin, whereas smooth counterparts typically incur significant computational overhead through their reliance on transcendental operations. To address these limitations, this paper proposes a general smoothing framework based on constructive approximation theory and introduces the Bernstein Linear Unit (BerLU). This novel activation function utilizes Bernstein polynomials to construct a differentiable quadratic transition region that effectively eliminates singularities while maintaining a piecewise linear structure. Theoretical analysis demonstrates that the proposed method guarantees strictly continuous differentiability and a non-expansive Lipschitz constant of one, which ensures stable gradient propagation and prevents the gradient explosion problems common in deep architectures. Comprehensive empirical evaluations across representative Vision Transformer and Convolutional Neural Network architectures confirm that this approach consistently outperforms state-of-the-art baselines on standard image classification benchmarks while delivering superior computational and memory efficiency.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_02591
institution	arXiv
publishDate	2026
record_format	arxiv
title	Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions
topic	Artificial Intelligence
url	https://arxiv.org/abs/2605.02591
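
The abstract in the contents field describes a quadratic transition region built from Bernstein polynomials that joins two linear pieces with a continuous derivative and a Lipschitz constant of one. The record does not give the BerLU formula, so the following is only a minimal sketch of that general idea, assuming a ReLU-like target, a symmetric transition interval [-delta, delta], and control ordinates (0, 0, delta). The names `bernstein_basis_2`, `smoothed_relu`, `smoothed_relu_grad`, and the parameter `delta` are hypothetical and are not taken from the paper.

```python
def bernstein_basis_2(t):
    """Return the three degree-2 Bernstein basis polynomials at t in [0, 1]."""
    return ((1.0 - t) ** 2, 2.0 * t * (1.0 - t), t ** 2)


def smoothed_relu(x, delta=1.0):
    """ReLU-like function with a quadratic Bernstein transition on [-delta, delta].

    Assumption: this is an illustrative construction, not the paper's BerLU.
    Outside the transition interval the function is exactly piecewise linear.
    """
    if x <= -delta:
        return 0.0
    if x >= delta:
        return float(x)
    # Map x in [-delta, delta] to the Bernstein parameter t in [0, 1].
    t = (x + delta) / (2.0 * delta)
    b0, b1, b2 = bernstein_basis_2(t)
    # Control ordinates (0, 0, delta) match the value and slope of both
    # linear pieces at the ends of the transition interval.
    return b0 * 0.0 + b1 * 0.0 + b2 * delta


def smoothed_relu_grad(x, delta=1.0):
    """Derivative of smoothed_relu; it rises linearly from 0 to 1."""
    if x <= -delta:
        return 0.0
    if x >= delta:
        return 1.0
    return (x + delta) / (2.0 * delta)
```

Under these assumptions the transition reduces to (x + delta)^2 / (4 * delta), which needs no transcendental operations, and the derivative stays within [0, 1], so the function is 1-Lipschitz.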

Similar Items