Bibliographic Details
Main Authors: Qi, Xuan; Wei, Yi; Yu, Fanqi; Shen, Furao; Murino, Vittorio; Beyan, Cigdem
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2605.04946
Table of Contents:
  • Batch normalization (BN) is central to modern deep networks, but its effect on the realized function during training remains less understood than its optimization benefits. We study training-time BN in continuous piecewise-affine (CPA) networks through the geometry of switching hyperplanes and the induced affine-region partition. Conditioned on a mini-batch, we show that BN defines for each neuron a reference hyperplane through the batch centroid, and that breakpoint-switching hyperplanes are parallel translates whose offsets are expressed in batch-standardized coordinates and are independent of the raw bias. This yields an exact criterion for when a switching hyperplane intersects a local $\ell_\infty$ window and motivates a local region-density functional based on exact affine-region counts. Under explicit sufficient conditions, we show that BN increases expected local partition refinement in ReLU and more general piecewise-affine networks, and that this mechanism transfers locally through depth inside parent affine regions where the upstream representation map is an affine embedding. These results provide a function-level geometric account of training-time BN as a batch-conditional recentering mechanism near the data.
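
The bias-independence claim in the abstract can be checked numerically for a single neuron. The sketch below is illustrative only and is not taken from the preprint: it assumes the standard BN parameterization (scale `gamma`, shift `beta`, batch mean and biased batch variance, optional `eps`), and the function names `bn_switching_hyperplane` and `intersects_linf_window` are hypothetical. Setting `gamma * (w.x + b - mu) / sigma + beta` equal to a breakpoint and solving gives `w.x = w.centroid + sigma * (breakpoint - beta) / gamma`, so the raw bias `b` cancels. The window test uses the standard dual-norm fact that the hyperplane `w.x = c` meets the box of radius `r` around `x0` exactly when `|w.x0 - c| <= r * ||w||_1`; the preprint's own criterion may be stated differently.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def bn_switching_hyperplane(batch, w, b, gamma, beta, breakpoint=0.0, eps=0.0):
    """Return (w, c) so that the switching set is {x : w.x = c}.

    Assumes standard training-time BN on the pre-activation z = w.x + b,
    with batch mean mu and biased batch variance (hypothetical helper).
    """
    n = len(batch)
    pre = [dot(w, x) + b for x in batch]          # raw pre-activations
    mu = sum(pre) / n                             # batch mean (contains b)
    var = sum((z - mu) ** 2 for z in pre) / n     # batch variance (b-free)
    sigma = math.sqrt(var + eps)
    # Solve gamma * (w.x + b - mu) / sigma + beta = breakpoint for w.x.
    # mu - b equals w.centroid, so b drops out of the offset.
    c = (mu - b) + sigma * (breakpoint - beta) / gamma
    return w, c


def intersects_linf_window(w, c, center, radius):
    """True if {x : w.x = c} meets the l_inf ball around `center`."""
    return abs(dot(w, center) - c) <= radius * sum(abs(a) for a in w)


# Toy batch with centroid (1, 1); all values here are made up.
batch = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
w = [1.0, 1.0]
for bias in (0.0, 5.0):
    _, c = bn_switching_hyperplane(batch, w, bias, gamma=1.0, beta=0.0)
    print(bias, c, intersects_linf_window(w, c, [1.0, 1.0], 0.1))
```

With `beta = 0` and a breakpoint at zero, the computed hyperplane passes through the batch centroid, which matches the abstract's reference hyperplane; a nonzero `beta` translates it in parallel by an amount measured in units of the batch standard deviation.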