Developer Guide

Distributed Processing

This mode assumes that the data set is split into nblocks blocks across computation nodes.
PCA computation in the distributed processing mode follows the general schema described in Algorithms.
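As a rough illustration of the data layout this mode assumes, the sketch below splits n observations into nblocks contiguous row ranges, one per node. The helper name blockRanges and the near-equal split are assumptions made for illustration only; the library does not prescribe how you partition the data.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the library): split n rows into nblocks
// contiguous half-open ranges [begin, end) whose sizes differ by at most one.
std::vector<std::pair<std::size_t, std::size_t>> blockRanges(std::size_t n,
                                                             std::size_t nblocks) {
    assert(nblocks > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    ranges.reserve(nblocks);
    const std::size_t base = n / nblocks;   // minimum number of rows per block
    const std::size_t extra = n % nblocks;  // the first `extra` blocks get one more row
    std::size_t begin = 0;
    for (std::size_t i = 0; i < nblocks; ++i) {
        const std::size_t size = base + (i < extra ? 1 : 0);
        ranges.emplace_back(begin, begin + size);
        begin += size;
    }
    return ranges;
}
```

Each range corresponds to the n_i rows of the i-th data block that a local node would process.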

Algorithm Parameters

The PCA algorithm in the distributed processing mode has the following parameters, depending on the computation method parameter method:
Parameter | method | Default Value | Description
computeStep | defaultDense or svdDense | Not applicable | The parameter required to initialize the algorithm. Can be step1Local (the first step, performed on local nodes) or step2Master (the second step, performed on a master node).
algorithmFPType | defaultDense or svdDense | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
method | Not applicable | defaultDense | Available methods for PCA computation: defaultDense (the correlation method) or svdDense (the SVD method).
covariance | defaultDense | SharedPtr<covariance::Distributed<computeStep, algorithmFPType, covariance::defaultDense> > | The correlation and variance-covariance matrices algorithm to be used for PCA computations with the correlation method. For details, see the Distributed Processing section of Correlation and Variance-covariance Matrices.

Correlation Method (defaultDense)

Use the following two-step schema:
Step 1 - on Local Nodes
In this step, the PCA algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input
data | Pointer to the n_i x p numeric table that represents the i-th data block on the local node. The input can be an object of any class derived from NumericTable.
In this step, PCA calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
Result ID | Result
nObservationsCorrelation | Pointer to the 1 x 1 numeric table with the number of observations processed so far on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except CSRNumericTable.
crossProductCorrelation | Pointer to the p x p numeric table with the cross-product matrix computed so far on the local node. By default, this table is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
sumCorrelation | Pointer to the 1 x p numeric table with partial sums computed so far on the local node. By default, this table is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
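To make the three Step 1 results concrete, here is a minimal standalone sketch that accumulates them for one local block. It does not use the library API: the PartialResult struct and the computeLocal function are hypothetical, and the sketch assumes the cross-product is the raw product X^T X of the block, which may differ from the library's internal definition (for example, the library may center the data first).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring the three Step 1 results.
struct PartialResult {
    double nObservations;              // 1 x 1: number of rows processed
    std::vector<double> sum;           // 1 x p: per-feature sums
    std::vector<double> crossProduct;  // p x p, stored row by row
};

// Accumulate one local block (nRows x p, stored row by row).
// Assumption: crossProduct holds the raw cross-product X^T X of the block.
PartialResult computeLocal(const std::vector<double>& block,
                           std::size_t nRows, std::size_t p) {
    assert(block.size() == nRows * p);
    PartialResult r;
    r.nObservations = static_cast<double>(nRows);
    r.sum.assign(p, 0.0);
    r.crossProduct.assign(p * p, 0.0);
    for (std::size_t i = 0; i < nRows; ++i) {
        const double* x = block.data() + i * p;  // i-th observation
        for (std::size_t j = 0; j < p; ++j) {
            r.sum[j] += x[j];
            for (std::size_t k = 0; k < p; ++k) {
                r.crossProduct[j * p + k] += x[j] * x[k];
            }
        }
    }
    return r;
}
```

Under this assumption all three quantities are plain sums over rows, so partial results from different nodes can later be combined by addition.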
Step 2 - on Master Node
In this step, the PCA algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input
partialResults | A collection that contains results computed in Step 1 on local nodes (nObservationsCorrelation, crossProductCorrelation, and sumCorrelation). The collection can contain objects of any class derived from NumericTable except PackedSymmetricMatrix and PackedTriangularMatrix.
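The following standalone sketch shows one way a master node could combine such a collection into a correlation matrix. It is not the library's implementation: the PartialResult struct and correlationFromPartials are hypothetical names, and the sketch assumes each partial cross-product is the raw product X^T X of a block, so that partial results merge by addition.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the Step 1 results of one node.
struct PartialResult {
    double nObservations;              // 1 x 1
    std::vector<double> sum;           // 1 x p
    std::vector<double> crossProduct;  // p x p, stored row by row (assumed raw X^T X)
};

// Merge the partial results and derive the p x p correlation matrix.
std::vector<double> correlationFromPartials(const std::vector<PartialResult>& parts,
                                            std::size_t p) {
    double n = 0.0;
    std::vector<double> s(p, 0.0), c(p * p, 0.0);
    for (const PartialResult& part : parts) {
        assert(part.sum.size() == p && part.crossProduct.size() == p * p);
        n += part.nObservations;
        for (std::size_t j = 0; j < p; ++j) s[j] += part.sum[j];
        for (std::size_t j = 0; j < p * p; ++j) c[j] += part.crossProduct[j];
    }
    assert(n > 1.0);
    // Sample covariance: cov(j,k) = (C_jk - s_j * s_k / n) / (n - 1).
    std::vector<double> cov(p * p);
    for (std::size_t j = 0; j < p; ++j)
        for (std::size_t k = 0; k < p; ++k)
            cov[j * p + k] = (c[j * p + k] - s[j] * s[k] / n) / (n - 1.0);
    // Correlation: corr(j,k) = cov(j,k) / sqrt(cov(j,j) * cov(k,k)).
    std::vector<double> corr(p * p);
    for (std::size_t j = 0; j < p; ++j)
        for (std::size_t k = 0; k < p; ++k)
            corr[j * p + k] = cov[j * p + k] / std::sqrt(cov[j * p + j] * cov[k * p + k]);
    return corr;
}
```

The sketch assumes every feature has nonzero variance; a constant feature would make the division undefined and needs separate handling.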
In this step, PCA calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
Result ID | Result
eigenvalues | Pointer to the 1 x p numeric table that contains eigenvalues in descending order. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
eigenvectors | Pointer to the p x p numeric table that contains eigenvectors in row-major order. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
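To illustrate the shape and ordering of these results, the sketch below decomposes a small symmetric matrix with the classical Jacobi rotation method and then sorts the output. The solver is a stand-in chosen for brevity, not the method the library uses; the names EigenResult and symmetricEigen are hypothetical, and the sketch assumes that row-major order means the k-th eigenvector occupies row k of the p x p table.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

struct EigenResult {
    std::vector<double> eigenvalues;   // 1 x p, descending
    std::vector<double> eigenvectors;  // p x p, row k holds the k-th eigenvector
};

// Jacobi eigenvalue iteration for a symmetric p x p matrix stored row by row.
EigenResult symmetricEigen(std::vector<double> a, std::size_t p) {
    assert(a.size() == p * p);
    std::vector<double> v(p * p, 0.0);  // accumulated rotations, starts as identity
    for (std::size_t i = 0; i < p; ++i) v[i * p + i] = 1.0;
    for (int sweep = 0; sweep < 100; ++sweep) {
        double off = 0.0;  // squared norm of the upper off-diagonal part
        for (std::size_t i = 0; i < p; ++i)
            for (std::size_t j = i + 1; j < p; ++j) off += a[i * p + j] * a[i * p + j];
        if (off < 1e-24) break;
        for (std::size_t r = 0; r < p; ++r) {
            for (std::size_t q = r + 1; q < p; ++q) {
                if (a[r * p + q] == 0.0) continue;
                // Choose the rotation angle that zeroes a[r][q].
                const double theta = (a[q * p + q] - a[r * p + r]) / (2.0 * a[r * p + q]);
                const double t = (theta >= 0.0 ? 1.0 : -1.0) /
                                 (std::fabs(theta) + std::sqrt(theta * theta + 1.0));
                const double c = 1.0 / std::sqrt(t * t + 1.0);
                const double s = t * c;
                for (std::size_t k = 0; k < p; ++k) {  // A <- A * J, V <- V * J
                    const double akr = a[k * p + r], akq = a[k * p + q];
                    a[k * p + r] = c * akr - s * akq;
                    a[k * p + q] = s * akr + c * akq;
                    const double vkr = v[k * p + r], vkq = v[k * p + q];
                    v[k * p + r] = c * vkr - s * vkq;
                    v[k * p + q] = s * vkr + c * vkq;
                }
                for (std::size_t k = 0; k < p; ++k) {  // A <- J^T * A
                    const double ark = a[r * p + k], aqk = a[q * p + k];
                    a[r * p + k] = c * ark - s * aqk;
                    a[q * p + k] = s * ark + c * aqk;
                }
            }
        }
    }
    // Sort eigenvalues in descending order and copy eigenvectors (columns of v) into rows.
    std::vector<std::size_t> order(p);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](std::size_t x, std::size_t y) { return a[x * p + x] > a[y * p + y]; });
    EigenResult res;
    res.eigenvalues.resize(p);
    res.eigenvectors.resize(p * p);
    for (std::size_t k = 0; k < p; ++k) {
        res.eigenvalues[k] = a[order[k] * p + order[k]];
        for (std::size_t j = 0; j < p; ++j) res.eigenvectors[k * p + j] = v[j * p + order[k]];
    }
    return res;
}
```

Eigenvectors are determined only up to sign, so compare them with a reference result accordingly.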

SVD Method (svdDense)

Use the following two-step schema:
Step 1 - on Local Nodes
In this step, the PCA algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input
data | Pointer to the n_i x p numeric table that represents the i-th data block on the local node. The input can be an object of any class derived from NumericTable.
In this step, PCA calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
Result ID | Result
nObservationsSVD | Pointer to the 1 x 1 numeric table with the number of observations processed so far on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except CSRNumericTable.
sumSVD | Pointer to the 1 x p numeric table with partial sums computed so far on the local node. By default, this table is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
sumSquaresSVD | Pointer to the 1 x p numeric table with partial sums of squares computed so far on the local node. By default, this table is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
auxiliaryDataSVD | A collection of numeric tables, each with a partial result to transmit to the master node for Step 2. The collection can contain objects of any class derived from NumericTable except PackedSymmetricMatrix and PackedTriangularMatrix.
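As a standalone sketch of the first three results, the code below accumulates the observation count, per-feature sums, and per-feature sums of squares for one local block. The struct SvdPartialSums and the function computeLocalSums are hypothetical and do not use the library API. The contents of auxiliaryDataSVD are not described here, so the sketch does not model them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring nObservationsSVD, sumSVD, and sumSquaresSVD.
struct SvdPartialSums {
    double nObservations;            // 1 x 1
    std::vector<double> sum;         // 1 x p
    std::vector<double> sumSquares;  // 1 x p
};

// Accumulate the sums for one local block (nRows x p, stored row by row).
SvdPartialSums computeLocalSums(const std::vector<double>& block,
                                std::size_t nRows, std::size_t p) {
    assert(block.size() == nRows * p);
    SvdPartialSums r;
    r.nObservations = static_cast<double>(nRows);
    r.sum.assign(p, 0.0);
    r.sumSquares.assign(p, 0.0);
    for (std::size_t i = 0; i < nRows; ++i) {
        for (std::size_t j = 0; j < p; ++j) {
            const double x = block[i * p + j];
            r.sum[j] += x;
            r.sumSquares[j] += x * x;
        }
    }
    return r;
}
```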
Step 2 - on Master Node
In this step, the PCA algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
Input ID | Input
partialResults | A collection that contains results computed in Step 1 on local nodes (nObservationsSVD, sumSVD, sumSquaresSVD, and auxiliaryDataSVD). The collection can contain objects of any class derived from NumericTable except PackedSymmetricMatrix and PackedTriangularMatrix.
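This section does not spell out how the master node uses each partial result. One natural use of the counts, sums, and sums of squares, shown here purely as an assumption, is to recover per-feature means and sample variances for the whole data set. The names SvdPartialSums, FeatureMoments, and mergeMoments are hypothetical, and auxiliaryDataSVD is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the scalar and vector Step 1 results of one node.
struct SvdPartialSums {
    double nObservations;            // 1 x 1
    std::vector<double> sum;         // 1 x p
    std::vector<double> sumSquares;  // 1 x p
};

struct FeatureMoments {
    std::vector<double> mean;      // 1 x p
    std::vector<double> variance;  // 1 x p, sample variance
};

// Merge partial sums from all nodes and derive per-feature moments.
FeatureMoments mergeMoments(const std::vector<SvdPartialSums>& parts, std::size_t p) {
    double n = 0.0;
    std::vector<double> s(p, 0.0), ss(p, 0.0);
    for (const SvdPartialSums& part : parts) {
        assert(part.sum.size() == p && part.sumSquares.size() == p);
        n += part.nObservations;
        for (std::size_t j = 0; j < p; ++j) {
            s[j] += part.sum[j];
            ss[j] += part.sumSquares[j];
        }
    }
    assert(n > 1.0);
    FeatureMoments m;
    m.mean.resize(p);
    m.variance.resize(p);
    for (std::size_t j = 0; j < p; ++j) {
        m.mean[j] = s[j] / n;
        // var_j = (sum of squares - (sum)^2 / n) / (n - 1)
        m.variance[j] = (ss[j] - s[j] * s[j] / n) / (n - 1.0);
    }
    return m;
}
```

This one-pass variance formula can lose precision when the mean is large relative to the spread of the data, so treat it as an illustration rather than a numerically robust recipe.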
In this step, PCA calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
Result ID | Result
eigenvalues | Pointer to the 1 x p numeric table that contains eigenvalues in descending order. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
eigenvectors | Pointer to the p x p numeric table that contains eigenvectors in row-major order. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
