Computation

The adaptive subgradient (AdaGrad) method is a special case of an iterative solver. For parameters, input, and output of iterative solvers, see Iterative Solver > Computation.

Algorithm Input

In addition to the input of the iterative solver, the AdaGrad method accepts the following optional input:

| OptionalDataID | Input |
|---|---|
| gradientSquareSum | Numeric table of size p x 1 with the values of G_t. Each value is the accumulated sum of squares of the corresponding gradient coordinate. |
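As an illustration of what gradientSquareSum holds, the following minimal sketch accumulates the per-coordinate sum of squared gradients in plain Python. The function name and the use of Python lists in place of a p x 1 numeric table are assumptions made for this example and are not part of the library API.

```python
def accumulate_gradient_squares(gradient_square_sum, gradient):
    """Return G_t given G_{t-1} and the current gradient g_t (both of length p).

    Hypothetical helper for illustration; not a library function.
    """
    if len(gradient_square_sum) != len(gradient):
        raise ValueError("both vectors must have length p")
    # Each coordinate accumulates the square of its own gradient component.
    return [g_sum + g * g for g_sum, g in zip(gradient_square_sum, gradient)]


# Start from zeros (no optional input supplied) and add two gradients.
G = [0.0, 0.0, 0.0]
for grad in ([1.0, -2.0, 0.5], [0.5, 1.0, -0.5]):
    G = accumulate_gradient_squares(G, grad)

print(G)
```

Starting the accumulator from a previously stored vector instead of zeros corresponds to supplying gradientSquareSum as optional input.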

Algorithm Parameters

In addition to parameters of the iterative solver, the AdaGrad method has the following parameters:

| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Default performance-oriented computation method. |
| batchIndices | NULL | Numeric table of size nIterations x batchSize for the defaultDense method that represents 32-bit integer indices of terms in the objective function. If no indices are provided, the algorithm generates random indices. |
| batchSize | 128 | Number of batch indices used to compute the stochastic gradient. If batchSize equals the number of terms in the objective function, no random sampling is performed, and all terms are used to calculate the gradient. The algorithm ignores this parameter if the batchIndices parameter is provided. |
| learningRate | Numeric table of size 1 x 1 that contains the default step length equal to 0.01 | Numeric table of size 1 x 1 that contains the value of the learning rate η. This parameter can be an object of any class derived from NumericTable, except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
| degenerateCasesThreshold | 1e-08 | Value ε needed to avoid degenerate cases when computing square roots. |
| engine | SharedPtr<engines::mt19937::Batch>() | Pointer to the random number generator engine that is used internally to generate 32-bit integer indices of terms in the objective function. |
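The interaction between batchIndices, batchSize, and engine described above can be sketched as follows. This is a plain-Python illustration, not the library's implementation: the function select_batch is hypothetical, Python's random.Random stands in for the engine, and sampling with replacement is an assumption, since the description does not say how the random indices are drawn.

```python
import random


def select_batch(iteration, n_terms, batch_size=128, batch_indices=None, rng=None):
    """Pick the objective-function terms used at one iteration (hypothetical helper)."""
    if batch_indices is not None:
        # Explicit indices take precedence; batch_size is ignored.
        return list(batch_indices[iteration])
    if batch_size == n_terms:
        # No random sampling: every term contributes to the gradient.
        return list(range(n_terms))
    # Otherwise draw random indices (assumption: sampled with replacement).
    rng = rng or random.Random()
    return [rng.randrange(n_terms) for _ in range(batch_size)]


# Row 1 of a user-supplied nIterations x batchSize table is used as-is.
print(select_batch(1, 10, batch_size=4, batch_indices=[[0, 1], [2, 3]]))
# batch_size equal to the number of terms selects all of them.
print(select_batch(0, 5, batch_size=5))
```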

Algorithm Output

In addition to the output of the iterative solver, the AdaGrad method calculates the following optional result:

| OptionalDataID | Output |
|---|---|
| gradientSquareSum | Numeric table of size p x 1 with the values of G_t. Each value is the accumulated sum of squares of the corresponding gradient coordinate. |
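To show how the learning rate η, the threshold ε, and G_t fit together, here is a minimal sketch of one AdaGrad step in plain Python. It assumes the common form of the update, in which each coordinate is scaled by η / sqrt(G_t + ε); the exact placement of ε in the library's formula is not stated in this section, and the function adagrad_step is hypothetical. The step returns the updated accumulator alongside the new parameters, mirroring the way gradientSquareSum appears as both optional input and optional result.

```python
import math


def adagrad_step(theta, gradient, gradient_square_sum,
                 learning_rate=0.01, degenerate_cases_threshold=1e-8):
    """One AdaGrad update (illustrative sketch, not the library API)."""
    # G_t = G_{t-1} + g_t^2, coordinate by coordinate.
    new_sum = [s + g * g for s, g in zip(gradient_square_sum, gradient)]
    # Assumed update: theta_i -= eta / sqrt(G_t,i + eps) * g_t,i.
    new_theta = [
        t - learning_rate / math.sqrt(s + degenerate_cases_threshold) * g
        for t, g, s in zip(theta, gradient, new_sum)
    ]
    return new_theta, new_sum


# Minimize f(theta) = sum(theta_i^2), whose gradient is 2 * theta.
theta = [1.0, -2.0]
G = [0.0, 0.0]
for _ in range(100):
    grad = [2.0 * t for t in theta]
    theta, G = adagrad_step(theta, grad, G, learning_rate=0.5)

print(theta, G)
```

With a zero gradient the accumulator stays at zero, and ε keeps the square root in the denominator strictly positive, so the step is well defined and leaves the parameter unchanged.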
