Usage Model: Training and Prediction

Training

Given a (p+1)-dimensional tensor q of size n_1 x n_2 x ... x n_p x n_(p+1) where each element is a sample, a (p+1)-dimensional tensor y of size n_1 x n_2 x ... x n_p x n_(p+1) where each element is a stated result for the corresponding sample, and a neural network that consists of n layers, the problem is to train the neural network. For more details, see Training and Prediction.
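For illustration only, the sketch below shows what such tensors might look like in NumPy, assuming the first dimension indexes the samples (a common convention; the actual layout depends on the network topology). The shapes and label range are made up for this example, and the result tensor here holds one class label per sample, a simplification of the general (p+1)-dimensional case described above.

```python
import numpy as np

# Hypothetical (p+1)-dimensional input tensor with p = 3: 100 samples,
# each a 3 x 32 x 32 array (for example, an RGB image). Shapes are illustrative.
q = np.random.rand(100, 3, 32, 32).astype(np.float32)

# Stated results (class labels) for the corresponding samples.
y = np.random.randint(0, 10, size=(100, 1))

print(q.shape, y.shape)   # (100, 3, 32, 32) (100, 1)
```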

Note

Intel DAAL supports only supervised learning with a known vector of class labels.

The key mechanism used to train a neural network is backward propagation of errors [Rumelhart86]. During the training stage, the algorithm performs forward and backward computations.

The training stage consists of one or several epochs. An epoch is the time interval when the network processes the entire input data set performing several forward passes, backward passes, and updates of weights and biases in the neural network model.

Each epoch consists of several iterations. An iteration is the time interval when the network performs one forward and one backward pass using a part of the input data set called a batch. At each iteration, the optimization solver performs an optimization step and updates weights and biases in the model.
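The epoch/iteration/batch structure described above can be summarized with a short sketch. Everything here is illustrative: train_step is a placeholder for one forward pass, one backward pass, and one solver update, not a DAAL routine, and the sizes are arbitrary.

```python
import numpy as np

def train_step(model, x_batch, y_batch):
    """Placeholder for one forward pass, one backward pass, and one update
    of weights and biases by the optimization solver."""
    pass

n_samples, batch_size, n_epochs = 1000, 50, 10
x = np.random.rand(n_samples, 20).astype(np.float32)
y = np.random.randint(0, 2, size=(n_samples, 1))
model = {}  # stands in for the neural network model (weights and biases)

for epoch in range(n_epochs):                      # an epoch processes the entire data set
    for start in range(0, n_samples, batch_size):  # an iteration processes one batch
        train_step(model,
                   x[start:start + batch_size],
                   y[start:start + batch_size])
```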

Forward Computation

Follow these steps:

  1. Provide the neural network with the input data for training. You can provide either one sample or a set of samples. The batchSize parameter specifies the number of simultaneously processed samples.

  2. Compute x_(i+1) = f_i(x_i), as sketched after these steps, where:

    • x_i is the input data for the layer i

    • x_(i+1) is the output value of the layer i

    • f_i(x) is the function corresponding to the layer i

    • i = 0, …, n - 1 is the index of the layer

    For some layers, the computation can also use weights w and biases b. For more details, see Layers.

  3. Compute an error as the result of a loss layer: e = f_loss(x_(n-1), y). For available loss layers, see Layers.
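The following is a minimal NumPy sketch of the forward computation for a small, hypothetical network (fully-connected layer, ReLU layer, softmax cross-entropy loss layer). The layer functions and sizes are illustrative; they are not DAAL layers.

```python
import numpy as np

def fully_connected(x, w, b):     # f_i(x), a layer that uses weights w and biases b
    return x @ w + b

def relu(x):                      # f_i(x), a layer without weights and biases
    return np.maximum(x, 0.0)

def softmax_cross_entropy(x, y):  # loss layer: e = f_loss(x_(n-1), y)
    z = np.exp(x - x.max(axis=1, keepdims=True))
    p = z / z.sum(axis=1, keepdims=True)
    return -np.log(p[np.arange(len(y)), y]).mean()

batch, d_in, d_out = 4, 8, 3
x0 = np.random.rand(batch, d_in)                 # x_0: a batch of input samples
w0, b0 = np.random.rand(d_in, d_out), np.zeros(d_out)
y = np.random.randint(0, d_out, size=batch)      # stated results (class labels)

x1 = fully_connected(x0, w0, b0)                 # x_1 = f_0(x_0)
x2 = relu(x1)                                    # x_2 = f_1(x_1)
e = softmax_cross_entropy(x2, y)                 # error from the loss layer
```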

Note

In the descriptions of specific forward layers in the Layers section, the preceding layer for the layer i is the layer i-1.

Backward Computation

Follow these steps:

  1. Compute the input gradient for the penultimate layer as the gradient of the loss layer: grad_n = ∇f_loss(x_(n-1), y).

  2. Compute grad_i = ∇f_i(x_i) * grad_(i+1), as sketched after these steps, where:

    • grad_i is the gradient obtained at the i-th layer

    • grad_(i+1) is the gradient obtained at the (i+1)-th layer

    • i = n - 1, ..., 0

    For some layers, the computation can also use weights w and biases b. For more details, see Layers.

  3. Apply one of the optimization methods to the results of the previous step. Compute w, b = optimizationSolver(w, b, grad_0), where w = (w_0, w_1, ..., w_(n-1)) and b = (b_0, b_1, ..., b_(n-1)). For available optimization solver algorithms, see Optimization Solvers.
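Continuing the forward sketch above, this is a minimal NumPy sketch of the backward computation and the solver update. The gradient formulas apply only to these illustrative layers, and plain stochastic gradient descent stands in for the optimization solver; none of this is DAAL code.

```python
import numpy as np

def softmax_cross_entropy_grad(x, y):   # grad_n = ∇f_loss(x_(n-1), y)
    z = np.exp(x - x.max(axis=1, keepdims=True))
    p = z / z.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0
    return p / len(y)

batch, d_in, d_out = 4, 8, 3
x0 = np.random.rand(batch, d_in)
w0, b0 = np.random.rand(d_in, d_out), np.zeros(d_out)
y = np.random.randint(0, d_out, size=batch)

# Forward pass (intermediate values x_i are needed by the backward pass).
x1 = x0 @ w0 + b0           # fully-connected layer
x2 = np.maximum(x1, 0.0)    # ReLU layer

# Backward pass: grad_i = ∇f_i(x_i) * grad_(i+1), for i = n - 1, ..., 0.
grad2 = softmax_cross_entropy_grad(x2, y)   # gradient of the loss layer
grad1 = grad2 * (x1 > 0.0)                  # ReLU backward
grad_w0 = x0.T @ grad1                      # gradient w.r.t. weights of the first layer
grad_b0 = grad1.sum(axis=0)                 # gradient w.r.t. biases of the first layer

# Optimization step: update weights and biases (plain SGD as a stand-in solver).
learning_rate = 0.1
w0 -= learning_rate * grad_w0
b0 -= learning_rate * grad_b0
```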

As a result of the training stage, you receive the trained model with the optimum set of weights and biases. Use the getPredictionModel method to get the model you can use at the prediction stage. This method performs the following steps to produce the prediction model from the training model:

  1. Clones all the forward layers of the training model except the loss layer.
  2. Replaces the loss layer with the layer returned by the getLayerForPrediction method of the forward loss layer. For example, the loss softmax cross-entropy forward layer is replaced with the softmax forward layer.
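Conceptually, the derivation of the prediction model can be pictured with the sketch below. The layer names and the loss-to-prediction mapping are illustrative stand-ins, not the DAAL API; in DAAL the replacement layer is supplied by getLayerForPrediction.

```python
# Training model: forward layers followed by a loss layer (names are illustrative).
training_layers = ["fullyconnected", "relu", "fullyconnected",
                   "loss_softmax_cross_entropy"]

# Hypothetical mapping from a loss layer to its prediction-stage counterpart.
layer_for_prediction = {"loss_softmax_cross_entropy": "softmax"}

# Step 1: clone all forward layers except the loss layer.
prediction_layers = training_layers[:-1]
# Step 2: replace the loss layer with its prediction-stage layer.
prediction_layers.append(layer_for_prediction[training_layers[-1]])

print(prediction_layers)  # ['fullyconnected', 'relu', 'fullyconnected', 'softmax']
```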

Note

In the descriptions of specific backward layers in the Layers section, the preceding layer for the layer i is the layer i+1.

Prediction

Given the trained network (with the optimum set of weights w and biases b) and a new (p+1)-dimensional tensor x of size n_1 x n_2 x ... x n_p x n_(p+1), the algorithm determines the result for each sample (one of the elements of the tensor y). Unlike the training stage, during prediction the algorithm performs the forward computation only.
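The following is a minimal sketch of the prediction stage using the trained parameters from the backward sketch above: only the forward computation is performed, with the softmax prediction layer in place of the loss layer. All functions and sizes are illustrative, not DAAL API calls.

```python
import numpy as np

def softmax(x):   # prediction-stage layer that replaced the loss layer
    z = np.exp(x - x.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

# Stand-ins for the trained (optimum) weights w and biases b.
d_in, d_out = 8, 3
w0, b0 = np.random.rand(d_in, d_out), np.zeros(d_out)

x_new = np.random.rand(5, d_in)                  # new tensor of samples
x1 = x_new @ w0 + b0                             # fully-connected layer
x2 = np.maximum(x1, 0.0)                         # ReLU layer
probabilities = softmax(x2)                      # prediction output
predicted_labels = probabilities.argmax(axis=1)  # result for each sample
```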
