Developer Guide

Training

The distributed processing mode assumes that the data set is split into nblocks blocks across computation nodes.
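As a rough illustration of this assumption, the sketch below splits the rows of a small ratings data set into nblocks contiguous blocks and records the global index of the first row of each block. The helper name split_into_blocks and the near-equal partitioning rule are assumptions made for illustration only; the text above does not prescribe how the data is partitioned.

```python
def split_into_blocks(rows, nblocks):
    """Split rows into nblocks contiguous blocks; return (blocks, offsets).

    offsets[i] is the global index of the first row of block i.
    Hypothetical helper, not part of the library API.
    """
    if nblocks < 1 or nblocks > len(rows):
        raise ValueError("nblocks must be between 1 and the number of rows")
    base, extra = divmod(len(rows), nblocks)
    blocks, offsets, start = [], [], 0
    for i in range(nblocks):
        size = base + (1 if i < extra else 0)  # first `extra` blocks get one more row
        blocks.append(rows[start:start + size])
        offsets.append(start)
        start += size
    return blocks, offsets


# Five users, three items, dense rows used only to keep the example short.
ratings = [[1, 0, 2], [0, 3, 0], [4, 0, 0], [0, 0, 5], [6, 0, 0]]
blocks, offsets = split_into_blocks(ratings, nblocks=3)
```

Each block would then live on its own computation node, and the per-block starting index plays the role of the offset used later in Step 3.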

Algorithm Parameters

At the training stage, implicit ALS recommender in the distributed processing mode has the following parameters:

| Parameter | Default Value | Description |
|---|---|---|
| computeStep | Not applicable | The parameter required to initialize the algorithm. Can be: step1Local (the first step, performed on local nodes); step2Master (the second step, performed on a master node); step3Local (the third step, performed on local nodes); step4Local (the fourth step, performed on local nodes). |
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | fastCSR | Performance-oriented computation method for CSR numeric tables, the only method supported by the algorithm. |
| nFactors | 10 | The total number of factors. |
| maxIterations | 5 | The number of iterations. |
| alpha | 40 | The rate of confidence. |
| lambda | 0.01 | The parameter of the regularization. |
| preferenceThreshold | 0 | Threshold used to define preference values. 0 is the only threshold supported so far. |
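To make the roles of alpha and preferenceThreshold concrete, the sketch below assumes the usual implicit ALS formulation, in which a raw rating r yields a confidence c = 1 + alpha * r and a binary preference p that is 1 when r exceeds the threshold and 0 otherwise. The TrainingParameters class and both helper functions are hypothetical names introduced for illustration; they only mirror the defaults listed above and are not the library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingParameters:
    """Hypothetical container mirroring the default values listed above."""
    nFactors: int = 10
    maxIterations: int = 5
    alpha: float = 40.0
    lambda_: float = 0.01          # trailing underscore: 'lambda' is a Python keyword
    preferenceThreshold: float = 0.0


def confidence(r, params):
    """Confidence in an observation, assuming c = 1 + alpha * r."""
    return 1.0 + params.alpha * r


def preference(r, params):
    """Binary preference: 1 if the rating exceeds the threshold, else 0."""
    return 1.0 if r > params.preferenceThreshold else 0.0


params = TrainingParameters()
c = confidence(0.5, params)   # 1 + 40 * 0.5
p = preference(0.5, params)
```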

Computation Process

At each iteration, the implicit ALS training algorithm alternates between re-computing user factors (X) and item factors (Y). These computations split each iteration into the following parts:
  1. Re-compute all user factors using the input data sets and item factors computed previously.
  2. Re-compute all item factors using input data sets in the transposed format and user factors computed previously.
Each part includes four steps executed either on local nodes or on the master node, as explained below and illustrated by graphics for nblocks=3. The main loop of the implicit ALS training stage is executed on the master node.
The following pseudocode illustrates the entire computation:
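A minimal Python sketch of that control flow, written from the description on this page rather than taken from the library, is shown here. It assumes that each step is supplied as a callable and that Step 3 on every node returns one piece per destination node; train_distributed, recorder, and the placeholder return values are hypothetical and exist only to make the ordering of the steps visible.

```python
def train_distributed(nblocks, max_iterations, steps):
    """Run the main loop on the master; `steps` maps step names to callables."""
    for _ in range(max_iterations):
        for part in (1, 2):  # part 1: user factors, part 2: item factors
            s1 = [steps["step1Local"](part, node) for node in range(nblocks)]
            s2 = steps["step2Master"](part, s1)
            s3 = [steps["step3Local"](part, node) for node in range(nblocks)]
            for node in range(nblocks):
                # Node `node` receives the pieces that every node prepared for it.
                for_node = [s3[src][node] for src in range(nblocks)]
                steps["step4Local"](part, node, s2, for_node)


trace = []


def recorder(name, result):
    """Build a stand-in step that logs its call and returns a placeholder."""
    def step(part, node_or_inputs, *rest):
        node = node_or_inputs if isinstance(node_or_inputs, int) else "master"
        trace.append((name, part, node))
        return result
    return step


steps = {
    "step1Local": recorder("step1Local", "cross-product"),
    "step2Master": recorder("step2Master", "merged cross-product"),
    "step3Local": recorder("step3Local", ["piece"] * 3),
    "step4Local": recorder("step4Local", None),
}
train_distributed(nblocks=3, max_iterations=1, steps=steps)
```

After the call, trace lists the steps in execution order: three Step 1 calls, one Step 2 call, three Step 3 calls, and three Step 4 calls for part 1, followed by the same sequence for part 2.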

Step 1 - on Local Nodes

This step works with the matrix:
  • Y^T in part 1 of the iteration
  • X in part 2 of the iteration
Parts of this matrix are used as input partial models.
[Figure: Implicit Alternating Least Squares Training, Distributed Processing Step 1 Workflow]
In this step, implicit ALS recommender training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| partialModel | Partial model computed on the local node. |

In this step, implicit ALS recommender training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| outputOfStep1ForStep2 | Pointer to the f x f numeric table with the sum of numeric tables calculated in Step 1. |
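For intuition about the f x f result, the sketch below assumes that each local node forms the cross-product B^T B of its own block B of factors (one row of f factors per user or item). This reading follows the standard implicit ALS derivation and the "merged cross-products" wording used for Step 2; the function name local_cross_product is hypothetical.

```python
def local_cross_product(factor_block):
    """Return the f x f matrix B^T B for a block B given as a list of factor rows."""
    f = len(factor_block[0])
    gram = [[0.0] * f for _ in range(f)]
    for row in factor_block:
        # Accumulate the outer product of each factor row with itself.
        for a in range(f):
            for b in range(f):
                gram[a][b] += row[a] * row[b]
    return gram


block = [[1.0, 2.0], [3.0, 4.0]]          # two rows, f = 2
step1_output = local_cross_product(block)
```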

Step 2 - on Master Node

This step uses local partial results from Step 1 as input.
[Figure: Implicit Alternating Least Squares Training, Distributed Processing Step 2 Workflow]
In this step, implicit ALS recommender training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| inputOfStep2FromStep1 | A collection of numeric tables computed on local nodes in Step 1. The collection may contain objects of any class derived from NumericTable except the PackedTriangularMatrix class with the lowerPackedTriangularMatrix layout. |

In this step, implicit ALS recommender training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| outputOfStep2ForStep4 | Pointer to the f x f numeric table with merged cross-products. |
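A plausible reading of "merged cross-products" is an element-wise sum of the f x f tables received from the local nodes, which yields the cross-product of the full factor matrix. The sketch below illustrates that assumption; merge_cross_products is a hypothetical name, and plain nested lists stand in for numeric tables.

```python
def merge_cross_products(partial_tables):
    """Sum a collection of f x f tables element-wise (assumed merge rule)."""
    f = len(partial_tables[0])
    merged = [[0.0] * f for _ in range(f)]
    for table in partial_tables:
        for a in range(f):
            for b in range(f):
                merged[a][b] += table[a][b]
    return merged


# One table per local node, here for nblocks = 3 and f = 2.
from_step1 = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 1.0], [1.0, 2.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
step2_output = merge_cross_products(from_step1)
```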

Step 3 - on Local Nodes

On each node i, this step uses the results of the previous steps and requires that you provide two extra matrices, Offset Table_i and Input of Step 3 From Init_i, computed at the initialization stage of the algorithm.
The only element of the Offset Table_i table refers to the:
  • i-th element of the offsets collection from step 2 of the distributed initialization algorithm in part 1 of the iteration
  • i-th element of the offsets collection from step 1 of the distributed initialization algorithm in part 2 of the iteration
The Input of Step 3 From Init is a key-value data collection that refers to the outputOfInitForComputeStep3 output of the initialization stage:
  • Output of step 1 of the initialization algorithm in part 1 of the iteration
  • Output of step 2 of the initialization algorithm in part 2 of the iteration
[Figure: Implicit Alternating Least Squares Training, Distributed Processing Step 3 Workflow]
In this step, implicit ALS recommender training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| partialModel | Partial model computed on the local node. |
| offset | Numeric table of size 1x1 that holds the global index of the starting row of the input partial model. A part of the key-value data collection offsets computed at the initialization stage of the algorithm. |

In this step, implicit ALS recommender training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| outputOfStep3ForStep4 | A key-value data collection that contains partial models to be used in Step 4. Each element of the collection contains an object of the PartialModel class. |
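One way to picture this step is as a routing operation: the local node holds a contiguous range of factor rows that starts at the global index given by offset, and it packages, for each destination node, the rows that node will need in Step 4. The sketch below assumes that the information from the initialization stage can be represented as a mapping from destination node to the global row indices it requires; that representation, the dictionary layout of the output, and the name build_partial_models are all hypothetical.

```python
def build_partial_models(local_factors, offset, needed_by_node):
    """Package local factor rows per destination node.

    local_factors  : factor rows held by this node (row k has global index offset + k)
    offset         : global index of the first local row
    needed_by_node : {destination node: [global row indices held by this node]}
    """
    output = {}
    for node, global_indices in needed_by_node.items():
        rows = []
        for g in global_indices:
            local = g - offset  # translate global index to local position
            if not 0 <= local < len(local_factors):
                raise IndexError(f"global row {g} is not stored on this node")
            rows.append(local_factors[local])
        output[node] = {"indices": list(global_indices), "factors": rows}
    return output


local_factors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # global rows 4, 5, 6
step3_output = build_partial_models(local_factors, offset=4,
                                    needed_by_node={0: [4, 6], 2: [5]})
```

Here node 0 would receive global rows 4 and 6, node 2 would receive row 5, and the global indices travel with the rows so the receiver can match them to the columns of its data block.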

Step 4 - on Local Nodes

This step uses the results of the previous steps and parts of the following matrix in the transposed format:
  • X in part 1 of the iteration
  • Y^T in part 2 of the iteration
The results of the step are the re-computed parts of this matrix.
[Figure: Implicit Alternating Least Squares Training, Distributed Processing Step 4 Workflow]
In this step, implicit ALS recommender training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| partialModels | A key-value data collection with partial models that contain user factors/item factors computed in Step 3. Each element of the collection contains an object of the PartialModel class. |
| partialData | Pointer to the CSR numeric table that holds the i-th part of the input data set, assuming that the data is divided by users/items. |
| inputOfStep4FromStep2 | Pointer to the f x f numeric table computed in Step 2. |

In this step, implicit ALS recommender training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| outputOfStep4ForStep1 | Pointer to the partial implicit ALS model that corresponds to the i-th data block. The partial model stores user factors/item factors. |
| outputOfStep4ForStep3 | |
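The re-computation in this step can be sketched with the textbook implicit ALS update: for each row u of the local data block, solve (G + sum_i (c_ui - 1) y_i y_i^T + lambda I) x_u = sum_i c_ui p_ui y_i, where G is the f x f table from Step 2 and y_i are the factor rows received from Step 3. The code below is an illustration under stated assumptions, not the library implementation: it uses zero-based CSR arrays, c = 1 + alpha * r, a threshold of 0, an unscaled lambda on the diagonal, and hypothetical function names.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def recompute_factors(values, col_indices, row_offsets, other_factors,
                      gram, alpha, lam):
    """Re-compute one factor row per row of a zero-based CSR data block."""
    f = len(gram)
    result = []
    for u in range(len(row_offsets) - 1):
        a = [[gram[i][j] + (lam if i == j else 0.0) for j in range(f)]
             for i in range(f)]
        b = [0.0] * f
        for k in range(row_offsets[u], row_offsets[u + 1]):
            r = values[k]
            y = other_factors[col_indices[k]]
            c = 1.0 + alpha * r            # assumed confidence
            p = 1.0 if r > 0 else 0.0      # preference with threshold 0
            for i in range(f):
                b[i] += c * p * y[i]
                for j in range(f):
                    a[i][j] += (c - 1.0) * y[i] * y[j]
        result.append(solve(a, b))
    return result


# Two data rows, one factor (f = 1); the second row has no ratings.
new_factors = recompute_factors(values=[1.0], col_indices=[0],
                                row_offsets=[0, 1, 1],
                                other_factors=[[2.0]], gram=[[4.0]],
                                alpha=1.0, lam=0.0)
```

In part 1 of the iteration the rows of the data block correspond to users and other_factors holds item factors; in part 2 the roles are swapped and the data block is the transposed one.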
