Batch Normalization Forward Layer

The forward batch normalization layer [Ioffe2015] normalizes the values $x_{i_1 \ldots i_p}$ of the input $X \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$ for the dimension $k \in \{1, \ldots, p\}$ and then scales and shifts the result of the normalization using the provided weights and biases as follows:

$$y_{i_1 \ldots i_p} = w_{i_k} \frac{x_{i_1 \ldots i_p} - \mu_{i_k}}{\sigma_{i_k}} + b_{i_k}$$

where the following characteristics are computed for the input X:

  • means $\mu_{i_k} = \frac{1}{m} \sum_{i_1, \ldots, i_{k-1}, i_{k+1}, \ldots, i_p} x_{i_1 \ldots i_p}$, where $m = \prod_{j \ne k} n_j$ is the number of input values that share the index $i_k$

  • standard deviations $\sigma_{i_k} = \sqrt{v_{i_k} + \varepsilon}$

    with variances $v_{i_k} = \frac{1}{m} \sum_{i_1, \ldots, i_{k-1}, i_{k+1}, \ldots, i_p} \left( x_{i_1 \ldots i_p} - \mu_{i_k} \right)^2$

  • a constant $\varepsilon$ added to the variance to improve numerical stability

The weights and biases are learned along with the rest of the model parameters.
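To make the forward computation concrete, the following is a minimal NumPy sketch of the normalization, scaling, and shifting described above. The function name batch_norm_forward and the argument layout are illustrative assumptions, not the layer's actual API; the dimension index k is 0-based in the code.

```python
import numpy as np

def batch_norm_forward(x, weights, biases, k, eps=1e-5):
    """Normalize x along dimension k, then scale and shift.

    x:        input tensor of shape (n_1, ..., n_p)
    weights:  1-D array of length n_k (the w_{i_k})
    biases:   1-D array of length n_k (the b_{i_k})
    k:        index of the normalized dimension (0-based)
    eps:      constant added to the variance for numerical stability
    """
    # Average over every dimension except k, keeping dims for broadcasting.
    axes = tuple(j for j in range(x.ndim) if j != k)
    mu = x.mean(axis=axes, keepdims=True)                 # means mu_{i_k}
    var = ((x - mu) ** 2).mean(axis=axes, keepdims=True)  # variances v_{i_k}
    sigma = np.sqrt(var + eps)                            # standard deviations

    # Reshape weights and biases so they broadcast along dimension k.
    shape = [1] * x.ndim
    shape[k] = x.shape[k]
    w = weights.reshape(shape)
    b = biases.reshape(shape)

    return w * (x - mu) / sigma + b
```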

Problem Statement

Given a $p$-dimensional tensor $X \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$, the problem is to compute the $p$-dimensional tensor $Y \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$ such that:

$$y_{i_1 \ldots i_p} = w_{i_k} \frac{x_{i_1 \ldots i_p} - \mu_{i_k}}{\sigma_{i_k}} + b_{i_k}$$

where:

  • mean $\mu_{i_k} = \frac{1}{m} \sum_{i_1, \ldots, i_{k-1}, i_{k+1}, \ldots, i_p} x_{i_1 \ldots i_p}$, where $m = \prod_{j \ne k} n_j$

  • variance $v_{i_k} = \frac{1}{m} \sum_{i_1, \ldots, i_{k-1}, i_{k+1}, \ldots, i_p} \left( x_{i_1 \ldots i_p} - \mu_{i_k} \right)^2$

  • standard deviation $\sigma_{i_k} = \sqrt{v_{i_k} + \varepsilon}$

  • weights $w_{i_k}$, $i_k \in \{1, \ldots, n_k\}$

  • biases $b_{i_k}$, $i_k \in \{1, \ldots, n_k\}$
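As a usage sketch under the same assumptions, the batch_norm_forward function from the example above can be applied to a small 4-dimensional input normalized along its channel dimension; the shapes and the choice of k = 1 are illustrative only.

```python
import numpy as np

# Illustrative 4-dimensional input (batch, channels, height, width),
# normalized along the channel dimension k = 1 (0-based).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3, 4, 4))
weights = np.ones(3)   # w_{i_k}, one weight per channel
biases = np.zeros(3)   # b_{i_k}, one bias per channel

y = batch_norm_forward(x, weights, biases, k=1)  # sketch defined earlier

# With unit weights and zero biases, each channel slice of y has
# approximately zero mean and unit variance.
print(y[:, 0].mean(), y[:, 0].var())
```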

At the model training stage, in addition to normalizing the input, the layer computes the population mean and variance using an exponential moving average with smoothing factor α ∈ [0, 1] applied to the mini-batch means and variances.
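A minimal sketch of such a running-statistics update is shown below. The names running_mean/running_var and the convention that α weights the new mini-batch statistics (with 1 − α on the accumulated values) are assumptions for illustration; implementations differ on which side the smoothing factor is applied to.

```python
# Hypothetical exponential-moving-average update of the population
# statistics, applied once per mini-batch during training.
def update_population_stats(running_mean, running_var,
                            batch_mean, batch_var, alpha):
    # alpha in [0, 1] weights the new mini-batch statistics here
    # (an assumed convention; some libraries weight the old values instead).
    new_mean = alpha * batch_mean + (1.0 - alpha) * running_mean
    new_var = alpha * batch_var + (1.0 - alpha) * running_var
    return new_mean, new_var
```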
