Developer Guide

Distributed Processing

You can use linear or ridge regression in the distributed processing mode only at the training stage. This computation mode assumes that the data set is split into nblocks blocks across computation nodes.

Training

Algorithm Parameters
The following table lists parameters of linear and ridge regressions at the training stage in the distributed processing mode. Some of these parameters or their values are specific to a linear or ridge regression algorithm.
computeStep (any algorithm; default value: not applicable)
  The parameter required to initialize the algorithm. Can be:
  • step1Local - the first step, performed on local nodes
  • step2Master - the second step, performed on the master node

algorithmFPType (any algorithm; default value: float)
  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

method (linear regression; default value: defaultDense)
  Available methods for linear regression training:
  • defaultDense - the normal equations method
  • qrDense - the method based on QR decomposition

method (ridge regression; default value: defaultDense)
  The default computation method used by ridge regression. The only method supported at the training stage is the normal equations method.

ridgeParameters (ridge regression; default value: a numeric table of size 1 x 1 that contains the ridge parameter equal to 1)
  A numeric table of size 1 x k (where k is the number of dependent variables) or 1 x 1. The contents of the table depend on its size:
  • size = 1 x k: the values of the ridge parameters λ_j for j = 1, …, k.
  • size = 1 x 1: the value of the ridge parameter shared by all dependent variables, λ_1 = … = λ_k.
  This parameter can be an object of any class derived from NumericTable, except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

interceptFlag (any algorithm; default value: true)
  A flag that indicates whether to compute the intercepts β_0j.
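The 1 x 1 versus 1 x k semantics of ridgeParameters can be sketched as follows. This is a pure-Python illustration of the table semantics described above, not the library API; the function name is hypothetical.

```python
# Illustrative sketch, not the library API: how a ridgeParameters row
# of size 1 x 1 or 1 x k maps to one ridge parameter per response.

def expand_ridge_parameters(row, k):
    """row: the single row of the ridgeParameters table; k responses."""
    if len(row) == 1:
        return row * k            # lambda_1 = ... = lambda_k
    if len(row) == k:
        return list(row)          # lambda_j for j = 1, ..., k
    raise ValueError("ridgeParameters must be of size 1 x 1 or 1 x k")

print(expand_ridge_parameters([1.0], 3))       # [1.0, 1.0, 1.0]
print(expand_ridge_parameters([0.5, 2.0], 2))  # [0.5, 2.0]
```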
Use the two-step computation schema for linear or ridge regression training in the distributed processing mode, as illustrated below:
Step 1 - on Local Nodes
[Figure: Linear Regression Training, Distributed Processing, Workflow Step 1]
In this step, linear or ridge regression training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
data
  Pointer to the n_i x p numeric table that represents the i-th data block on the local node. This table can be an object of any class derived from NumericTable.

dependentVariables
  Pointer to the n_i x k numeric table with responses associated with the i-th data block. This table can be an object of any class derived from NumericTable.
In this step, linear or ridge regression training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
partialModel
  Pointer to the partial linear or ridge regression model that corresponds to the i-th data block. The result can only be an object of the Model class.
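For the normal equations method, the Step 1 partial result can be thought of as the per-block cross-products of the local data. The following pure-Python sketch illustrates that idea only; it is not the library API, and the function name is hypothetical.

```python
# Illustrative sketch, not the library API: for the normal equations
# method, the i-th partial result conceptually carries the block's
# cross-products X_i^T X_i (p x p) and X_i^T Y_i (p x k).

def partial_cross_products(X, Y):
    """X: n_i rows of p features; Y: n_i rows of k responses."""
    n, p, k = len(X), len(X[0]), len(Y[0])
    xtx = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [[sum(X[r][a] * Y[r][j] for r in range(n)) for j in range(k)]
           for a in range(p)]
    return xtx, xty

# One local block: two observations, p = 2 features, k = 1 response.
xtx, xty = partial_cross_products([[1.0, 2.0], [3.0, 4.0]], [[5.0], [6.0]])
print(xtx)  # [[10.0, 14.0], [14.0, 20.0]]
print(xty)  # [[23.0], [34.0]]
```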
Step 2 - on Master Node
[Figure: Linear Regression Training, Distributed Processing, Workflow Step 2]
In this step, linear or ridge regression training accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
partialModels
  A collection of partial models computed on local nodes in Step 1. The collection contains objects of the Model class.
In this step, linear or ridge regression training calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
model
  Pointer to the linear or ridge regression model being trained. The result can only be an object of the Model class.
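Conceptually, Step 2 sums the per-block cross-products and solves the (optionally ridge-regularized) normal equations. A minimal pure-Python sketch of that math, assuming each partial is a pair (X_i^T X_i, X_i^T y_i) for a single response; this is not the library API, and the function name is hypothetical.

```python
# Illustrative sketch, not the library API: merge the partial
# cross-products from all blocks and solve
#   (sum_i X_i^T X_i + lambda * I) beta = sum_i X_i^T y_i,
# with lambda = 0 for linear regression and lambda > 0 for ridge.

def merge_and_solve(partials, lam=0.0):
    """partials: list of (p x p matrix, length-p vector) pairs."""
    p = len(partials[0][0])
    A = [[sum(xtx[a][b] for xtx, _ in partials) for b in range(p)]
         for a in range(p)]
    b = [sum(xty[a] for _, xty in partials) for a in range(p)]
    for a in range(p):
        A[a][a] += lam                      # ridge term lambda * I
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p
    for a in range(p - 1, -1, -1):
        beta[a] = (b[a] - sum(A[a][c] * beta[c]
                              for c in range(a + 1, p))) / A[a][a]
    return beta

# Two blocks drawn from y = 2x: each partial is (sum x^2, sum x*y).
print(merge_and_solve([([[5.0]], [10.0]), ([[9.0]], [18.0])]))  # [2.0]
```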

Product and Performance Information

1 Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804