Distributed Processing
The distributed processing mode assumes that the data set R is split into nblocks blocks across computation nodes.
Parameters
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm has the following parameters:
Parameter | Default Value | Description |
---|---|---|
algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
method | fastCSR | Performance-oriented computation method for CSR numeric tables, the only method supported by the algorithm. |
nFactors | 10 | The total number of factors. |
fullNUsers | 0 | The total number of users m. |
partition | Not applicable | A numeric table that describes how the input data is divided into parts, where nblocks is the number of input data parts and the i-th element contains the offset of the transposed i-th data part to be computed by the initialization algorithm. |
engine | SharedPtr<engines::mt19937::Batch>() | Pointer to the random number generator engine that is used internally at the initialization step. |
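As an illustration of what an offset-based partition can look like, the following plain-Python sketch splits m users into nblocks nearly equal parts. The helper name `make_partition`, the even-split policy, and the trailing element equal to m are assumptions made for this example; they are not part of the library API.

```python
def make_partition(full_n_users, nblocks):
    """Return nblocks + 1 ascending offsets that split the users into nearly equal parts."""
    if nblocks <= 0:
        raise ValueError("nblocks must be positive")
    base, extra = divmod(full_n_users, nblocks)
    offsets = [0]
    for i in range(nblocks):
        # The first `extra` parts receive one additional user.
        offsets.append(offsets[-1] + base + (1 if i < extra else 0))
    return offsets


partition = make_partition(full_n_users=10, nblocks=3)
print(partition)  # [0, 4, 7, 10]
```

In this sketch, part i covers the users from `partition[i]` up to, but not including, `partition[i + 1]`.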
To initialize the implicit ALS algorithm in the distributed processing mode, use the process described in the steps below.

Step 1 - on Local Nodes

Input
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm accepts the input described below.
Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input |
---|---|
dataColumnSlice | A numeric table with the part of the input data set held by the local node. The input should be an object of the CSRNumericTable class. |
Output
In the distributed processing mode, initialization of item factors for the implicit ALS algorithm calculates the results described below.
Pass the Partial Result ID as a parameter to the methods that access the results of your algorithm. Partial results that correspond to the outputOfInitForComputeStep3 and offsets Partial Result IDs should be transferred to Step 3 of the distributed ALS training algorithm.
Output of Initialization for Computing Step 3 (outputOfInitForComputeStep3) is a key-value data collection that maps components of the partial model on the i-th node to all local nodes. Keys in this data collection are indices of the nodes, and the value that corresponds to each key i is a numeric table that contains indices of the factors of the items to be transferred to the i-th node on Step 3 of the distributed ALS training algorithm.
User Offsets (offsets) is a key-value data collection, where the keys are indices of the nodes and the value that corresponds to the key i is a numeric table that contains the value of the starting offset of the user factors stored on the i-th node.
For more details, see Algorithms.
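To make the shape of the offsets collection concrete, here is a minimal plain-Python sketch that derives one starting offset per node from the number of factors stored on each node. The function name `starting_offsets` and the use of a dict in place of a key-value data collection are assumptions for illustration only.

```python
def starting_offsets(block_sizes):
    """Map each node index to the index of the first factor stored on that node."""
    offsets = {}
    start = 0
    for node, size in enumerate(block_sizes):
        offsets[node] = start
        start += size  # the next node starts where this one ends
    return offsets


print(starting_offsets([4, 3, 3]))  # {0: 0, 1: 4, 2: 7}
```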
Partial Result ID | Result |
---|---|
partialModel | The model with initialized item factors. The result can only be an object of the PartialModel class. |
outputOfInitForComputeStep3 | A key-value data collection that maps components of the partial model to the local nodes. |
offsets | A key-value data collection of size nblocks that holds the starting offsets of the factor indices on each node. |
outputOfStep1ForStep2 | A key-value data collection of size nblocks that contains the parts of the input numeric table: the j-th element of this collection is a numeric table whose size is defined by the partition parameter. |
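The splitting that produces outputOfStep1ForStep2 can be pictured with the sketch below. It assumes that the local column slice is given as (user, item, value) triples, that the partition is a list of nblocks + 1 ascending user offsets, and that users are re-indexed relative to the start of their part. The function name `split_by_partition` and the triple representation are hypothetical stand-ins for the CSR numeric tables that the library actually uses.

```python
from bisect import bisect_right


def split_by_partition(entries, partition):
    """Group (user, item, value) triples by the user range they fall into.

    Part j covers users partition[j] <= user < partition[j + 1].
    """
    nblocks = len(partition) - 1
    parts = {j: [] for j in range(nblocks)}
    for user, item, value in entries:
        j = bisect_right(partition, user) - 1
        if not 0 <= j < nblocks:
            raise ValueError(f"user {user} is outside the partition")
        # Store the user index relative to the start of part j.
        parts[j].append((user - partition[j], item, value))
    return parts


# Hypothetical column slice held by one node: items 0 and 1, users 0..9.
local_entries = [(0, 0, 1.0), (5, 0, 2.0), (9, 1, 3.0), (4, 1, 1.0)]
output_of_step1_for_step2 = split_by_partition(local_entries, [0, 4, 7, 10])
```

Each value in the resulting dict plays the role of one element of the collection, intended for the node with the same index.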
Step 2 - on Local Nodes

Input
This step uses the results of the previous step.
Input ID | Input |
---|---|
inputOfStep2FromStep1 | A key-value data collection of size nblocks that contains the parts of the input data set: the i-th element of this collection is a numeric table. |
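One way to read the hand-off between the two steps is as an all-to-all exchange: each node keeps, from every node's Step 1 output, the part addressed to itself. The sketch below illustrates that reading with plain dicts. The function name `gather_for_step2` and the placeholder strings are assumptions of this example, and the actual transport between nodes is left to the application.

```python
def gather_for_step2(step1_outputs, node):
    """Collect, for one destination node, the part addressed to it by every source node.

    step1_outputs[i][j] is the part that source node i produced for node j.
    """
    return {i: parts[node] for i, parts in step1_outputs.items()}


# Hypothetical Step 1 outputs of two nodes; the part contents are placeholders.
step1_outputs = {
    0: {0: "part(0->0)", 1: "part(0->1)"},
    1: {0: "part(1->0)", 1: "part(1->1)"},
}
input_of_step2_from_step1 = gather_for_step2(step1_outputs, node=1)
print(input_of_step2_from_step1)  # {0: 'part(0->1)', 1: 'part(1->1)'}
```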
Output
In this step, implicit ALS initialization calculates the partial results described below.
Pass the Partial Result ID as a parameter to the methods that access the results of your algorithm. Partial results that correspond to the outputOfInitForComputeStep3 and offsets Partial Result IDs should be transferred to Step 3 of the distributed ALS training algorithm.
Output of Initialization for Computing Step 3 (outputOfInitForComputeStep3) is a key-value data collection that maps components of the partial model on the i-th node to all local nodes. Keys in this data collection are indices of the nodes, and the value that corresponds to each key i is a numeric table that contains indices of the user factors to be transferred to the i-th node on Step 3 of the distributed ALS training algorithm.
Item Offsets (offsets) is a key-value data collection, where the keys are indices of the nodes and the value that corresponds to the key i is a numeric table that contains the value of the starting offset of the item factors stored on the i-th node.
For more details, see Algorithms.
Partial Result ID | Result |
---|---|
dataRowSlice | A numeric table with the part of the input data set R that the j-th node gets. |
outputOfInitForComputeStep3 | A key-value data collection that maps components of the partial model to the local nodes. |
offsets | A key-value data collection of size nblocks that holds the starting offsets of the factor indices on each node. |
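The mapping held in outputOfInitForComputeStep3 can be sketched as follows, under the assumption that node i needs the factor of a user whenever that user has a nonzero entry for at least one item owned by node i. That criterion, the function name `user_factors_to_transfer`, and the pair-based input are assumptions of this example and are not taken from the library.

```python
from bisect import bisect_right


def user_factors_to_transfer(row_slice_entries, item_partition):
    """Map each node index to the sorted local user indices whose factors it needs.

    row_slice_entries: (local_user, item) pairs that have a nonzero value in
    the row slice held by this node.
    item_partition: nblocks + 1 ascending item offsets; node i is assumed to
    own items item_partition[i] <= item < item_partition[i + 1].
    """
    nblocks = len(item_partition) - 1
    needed = {i: set() for i in range(nblocks)}
    for local_user, item in row_slice_entries:
        i = bisect_right(item_partition, item) - 1
        if not 0 <= i < nblocks:
            raise ValueError(f"item {item} is outside the partition")
        needed[i].add(local_user)
    return {i: sorted(users) for i, users in needed.items()}


entries = [(0, 0), (0, 5), (1, 5), (2, 1)]
print(user_factors_to_transfer(entries, [0, 3, 6]))  # {0: [0, 2], 1: [0, 1]}
```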