Developer Guide

Contents

Training

Algorithm Parameters

Neural network training in the distributed processing mode has the following parameters:
computeStep (default value: not applicable)
    The parameter required to initialize the algorithm. Can be:
      • step1Local - the first step, performed on local nodes
      • step2Master - the second step, performed on the master node

algorithmFPType (default value: float)
    The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method (default value: defaultDense)
    Performance-oriented computation method.

batchSize (default value: 128)
    The number of samples simultaneously used for training on each node. This value must be the same on all nodes for synchronous computations and may differ between nodes for asynchronous computations. This parameter is used by the algorithms running on local nodes.

optimizationSolver (default value: SharedPtr<optimization_solver::sgd::Batch<float, defaultDense> >())
    The optimization procedure used at the training stage. This parameter is used by the algorithm running on the master node.

Initialization

Figure: Neural Network Training Distributed Processing Initialization
Initialize batch sizes depending on whether the computation is synchronous or asynchronous:
  • In the synchronous case, set batchSize1 = batchSize2 = batchSize3.
  • In the asynchronous case, you can use different batch sizes.
Use the two-step computation schema for neural network training in the distributed processing mode, as illustrated below.

Step 1 - on Local Nodes

Figure: Neural Network Training Distributed Processing Step 1 Local Workflow
In this step, the neural network training algorithm accepts the following input. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
data
    Pointer to the tensor of size batchSize x n_2 x ... x n_p that represents the i-th data block on the local node. This input can be an object of any class derived from Tensor.

groundTruth
    Pointer to the tensor of size batchSize that stores the stated results (ground truth) associated with the i-th data block. This input can be an object of any class derived from Tensor.

inputModel
    The neural network model updated on the master node. This input can only be an object of the Model class. Update the model parameters using the setWeightsAndBiases() method of inputModel after the master node delivers them to local nodes.
In this step, the neural network training algorithm calculates the partial results described below. Pass the Partial Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
derivatives
    Pointer to the numeric table of size weightsAndBiasesSize x 1, where weightsAndBiasesSize is the number of model parameters. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

batchSize
    Pointer to the numeric table of size 1 x 1 that contains the number of samples simultaneously used for training on each node. By default, this partial result is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Step 2 - on Master Node

In this step, the neural network training algorithm accepts the following input. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
partialResults
    Results computed on local nodes in Step 1 (derivatives and batchSize). This input contains objects of the PartialResult class.
You can update the model on the master node incrementally, as partial results become available from local nodes. Note, however, that in synchronous computations the master node sends out the updated model only after it processes the partial results from all local nodes.
In this step, the neural network training algorithm calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
model
    Trained model with a set of weights and biases. This result can only be an object of the Model class.

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804