- Data-based
- Model-based
- Hybrid (data- and model-based)
Data-Based Parallelization
- The training data is split across local nodes.
- Each local node uses an instance of the same model to compute local derivatives.
- The master node updates the weights and biases of the model using the local derivatives and delivers the updated model back to the local nodes.
- Synchronous. The master node updates the model only after all local nodes have delivered their local derivatives for a given training iteration.
- Asynchronous. The master node:
- Immediately sends the latest version of the model to the local node that delivered the local derivatives.
- Updates the model as soon as the master node accumulates a sufficient amount of partial results. This amount is defined by the requirements of the application.
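The synchronous scheme above can be sketched in plain Python. This is a minimal illustration, not the library's API: the helper name `compute_local_derivatives` and the plain-list weight representation are assumptions, and gradient averaging stands in for whatever aggregation the master actually performs.

```python
# Sketch of one synchronous data-parallel iteration.
# `compute_local_derivatives` is a hypothetical callable that returns
# the derivatives for one node's data block, given the current weights.
def synchronous_iteration(master_weights, local_data_blocks,
                          compute_local_derivatives, learning_rate):
    # Each local node computes derivatives on its own data block,
    # using an identical copy of the current model.
    local_grads = [compute_local_derivatives(master_weights, block)
                   for block in local_data_blocks]

    # Synchronous mode: the master updates the model only after ALL
    # local nodes have delivered their derivatives for this iteration.
    n = len(local_grads)
    avg = [sum(g[i] for g in local_grads) / n
           for i in range(len(master_weights))]
    updated = [w - learning_rate * d
               for w, d in zip(master_weights, avg)]

    # The updated model is then delivered back to every local node.
    return updated
```

In the asynchronous variant, the master would instead update as soon as enough partial results accumulate and immediately return its latest model to whichever node just reported.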
- Initialize the neural network model using the `initialize()` method on the master node and propagate the model to the local nodes.
- Run the training algorithm on local nodes as described in the Usage Model: Training and Prediction > Training section with the following specifics of the distributed computation mode:
See the figure below to visualize an i-th iteration, corresponding to the i-th data block. After the computations for the i-th data block on a local node are finished, send the derivatives of the local weights and biases to the master node. The training algorithm on local nodes does not require an optimization solver.
- Provide each j-th local node with a local data set of size `localDataSize_j`.
- Specify the required `batchSize_j` parameter.
- Split the data set on a local node into `localDataSize_j`/`batchSize_j` data blocks, each processed by the local algorithm separately.
- The `batchSize_j` and `localDataSize_j` parameters must be the same on all local nodes for synchronous computations and can differ for asynchronous computations.
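The per-node splitting step can be illustrated with a short sketch. The function name is hypothetical; the point is that the number of blocks follows `localDataSize_j`/`batchSize_j`, with each block handed to the local algorithm separately.

```python
def split_into_blocks(local_data, batch_size):
    # Split a node's local data set of size localDataSize_j into
    # localDataSize_j / batchSize_j blocks; each block is processed
    # by the local training algorithm as one iteration.
    return [local_data[i:i + batch_size]
            for i in range(0, len(local_data), batch_size)]
```

For synchronous computations, every node would call this with the same `batch_size` and the same local data size, so all nodes produce the same number of blocks and stay in lockstep.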
- Run the training algorithm on the master node, providing the local derivatives from all local nodes. The algorithm uses the optimization solver specified in its `optimizationSolver` parameter; for available algorithms, see Optimization Solvers. After the computations are completed, send the updated weights and biases of the model to all local nodes. You can get the latest version of the model by calling the `finalizeCompute()` method after each run of the training algorithm on the master or a local node.
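The master-side step above can be sketched as follows. The `solver_step` interface is an assumption made for illustration, not the actual `optimizationSolver` API; it only shows how the master aggregates the delivered derivatives and applies one step of a pluggable solver before broadcasting the result.

```python
# Illustrative master-side update; the solver interface is an
# assumption, not the library's optimizationSolver API.
def master_update(weights, local_derivatives, solver_step):
    # Aggregate the derivatives delivered by all local nodes.
    n = len(local_derivatives)
    avg = [sum(g[i] for g in local_derivatives) / n
           for i in range(len(weights))]
    # Apply one step of the configured optimization solver and
    # return the updated model, which is sent back to all nodes.
    return solver_step(weights, avg)

def sgd_step(weights, grad, lr=0.1):
    # A minimal stand-in for a solver such as plain SGD.
    return [w - lr * g for w, g in zip(weights, grad)]
```

Making the solver a parameter mirrors the design described above: the aggregation logic stays fixed while the update rule (SGD, momentum, etc.) is chosen through `optimizationSolver`.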