Stochastic Gradient Descent Algorithm

The stochastic gradient descent (SGD) algorithm is a special case of an iterative solver. For more details, see Iterative Solver.

The following computation methods are available in Intel DAAL for the stochastic gradient descent algorithm:

  • Mini-batch
  • Default, a special case of mini-batch, used by default
  • Momentum

Mini-batch method

The mini-batch method (miniBatch) of the stochastic gradient descent algorithm [Mu2014] follows the algorithmic framework of an iterative solver with an empty set of intrinsic parameters of the algorithm $S_t$. The algorithm-specific transformation $T$ is defined for the learning rate sequence $\{\eta_t\}_{t=1,\ldots,\text{nIterations}}$, the conservative sequence $\{\gamma_t\}_{t=1,\ldots,\text{nIterations}}$, and the number of iterations in the internal loop $L$. The transformation $T$, the algorithm-specific vector $U$, and the power $d$ of the Lebesgue space are defined as follows:

$T(\theta_{t-1}, g(\theta_{t-1}), S_{t-1})$:

For $l$ from 1 until $L$:

  1. Update the function argument:

     $\theta := \theta - \eta_t \left( g(\theta) + \gamma_t (\theta - \theta_{t-1}) \right)$

  2. Compute the gradient on a new random batch $I$ of size $b$:

     $g(\theta) := \nabla F_I(\theta)$

Convergence check: $U = g(\theta_{t-1})$, $d = 2$
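
To make the update rule concrete, the following is a minimal NumPy sketch of the mini-batch scheme above. It is illustrative pseudocode, not the Intel DAAL API: the helper objective_grad(theta, batch), the sequences eta and gamma, and all other names are assumptions introduced for the example.

    import numpy as np

    # Illustrative sketch of the miniBatch update rule; not the Intel DAAL API.
    # objective_grad(theta, batch) is an assumed helper returning the gradient
    # of the objective F over the given batch of sample indices.
    def sgd_minibatch(objective_grad, theta0, eta, gamma, L, b, n_samples,
                      n_iterations, accuracy=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        theta_prev = theta0.astype(float)
        for t in range(n_iterations):
            batch = rng.choice(n_samples, size=b, replace=False)
            g = objective_grad(theta_prev, batch)       # g(theta_{t-1})
            # Convergence check: U = g(theta_{t-1}), d = 2 (Euclidean norm)
            if np.linalg.norm(g, ord=2) < accuracy:
                break
            theta = theta_prev.copy()
            for _ in range(L):                          # internal loop of length L
                # 1. Update the function argument
                theta -= eta[t] * (g + gamma[t] * (theta - theta_prev))
                # 2. Compute the gradient on a new random batch
                batch = rng.choice(n_samples, size=b, replace=False)
                g = objective_grad(theta, batch)
            theta_prev = theta
        return theta_prev

Here eta and gamma are per-iteration sequences indexed by t, matching the learning rate and conservative sequences in the definition above.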

Default method

The default method (defaultDense) is a particular case of the mini-batch method with the batch size $b = 1$, $L = 1$, and conservative sequence $\gamma_t \equiv 0$.
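
Under these settings the internal loop above collapses to the classic single-sample SGD step. A hypothetical one-step sketch, reusing the assumed objective_grad helper from the previous example:

    import numpy as np

    # With b = 1, L = 1, and gamma_t = 0, one outer iteration of the mini-batch
    # scheme reduces to a single plain SGD step on one random sample.
    def sgd_default_step(objective_grad, theta, eta_t, n_samples, rng):
        i = rng.integers(n_samples)        # single random sample index (b = 1)
        return theta - eta_t * objective_grad(theta, [i])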

Momentum method

The momentum method (momentum) of the stochastic gradient descent algorithm [Rumelhart86] follows the algorithmic framework of an iterative solver with the set of intrinsic parameters $S_t$. The algorithm-specific transformation $T$ is defined for the learning rate sequence $\{\eta_t\}_{t=1,\ldots,\text{nIterations}}$ and the momentum parameter $\mu \in [0, 1]$; the transformation $T$, the algorithm-specific vector $U$, and the power $d$ of the Lebesgue space are defined as follows:

$T(\theta_{t-1}, g(\theta_{t-1}), S_{t-1})$:

  1. $v_t = \mu \cdot v_{t-1} + \eta_t \cdot g(\theta_{t-1})$

  2. $\theta_t = \theta_{t-1} - v_t$

For the momentum method of the SGD algorithm, the set of intrinsic parameters $S_t$ only contains the last update vector $v_t$.

Convergence check: $U = g(\theta_{t-1})$, $d = 2$
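
A matching NumPy sketch of the momentum recurrence above, again illustrative rather than the DAAL API; mu, objective_grad, and the other names are assumptions for the example:

    import numpy as np

    # Illustrative sketch of the momentum update; not the Intel DAAL API.
    # The intrinsic state S_t is just the last update vector v, carried
    # across iterations; mu is the momentum parameter in [0, 1].
    def sgd_momentum(objective_grad, theta0, eta, mu, b, n_samples,
                     n_iterations, accuracy=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        theta = theta0.astype(float)
        v = np.zeros_like(theta)                       # initial update vector
        for t in range(n_iterations):
            batch = rng.choice(n_samples, size=b, replace=False)
            g = objective_grad(theta, batch)           # g(theta_{t-1})
            # Convergence check: U = g(theta_{t-1}), d = 2 (Euclidean norm)
            if np.linalg.norm(g, ord=2) < accuracy:
                break
            v = mu * v + eta[t] * g                    # v_t = mu*v_{t-1} + eta_t*g
            theta = theta - v                          # theta_t = theta_{t-1} - v_t
        return theta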
