Distributed Processing
Algorithm Parameters
Parameter  Default Value  Description
 

computeStep  Not applicable  The parameter required to initialize the algorithm. Can be:
 
algorithmFPType  float  The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
 
method  defaultDense  Available computation methods for K-Means clustering:
 
nClusters  Not applicable  The number of clusters. Required to initialize the algorithm.
 
gamma  1.0  The weight to be used in distance calculation for binary categorical features.
 
distanceType  euclidean  The measure of closeness between points (observations) being clustered. The only distance type supported so far is the Euclidean distance.
 
assignFlag  false  A flag that enables computation of assignments, that is, assigning cluster indices to respective observations.

Step 1 on Local Nodes
Input ID  Input
 

data  Pointer to the n_i x p numeric table that represents the i-th data block on the local node. The input can be an object of any class derived from NumericTable.
 
inputCentroids  Pointer to the nClusters x p numeric table with the initial cluster centroids. This input can be an object of any class derived from NumericTable.

Partial Result ID  Result
 

nObservations  Pointer to the nClusters x 1 numeric table that contains the number of observations assigned to the clusters on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except CSRNumericTable.
 
partialSums  Pointer to the nClusters x p numeric table with partial sums of observations assigned to the clusters on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
 
partialObjectiveFunction (deprecated name: partialGoalFunction)  Pointer to the 1 x 1 numeric table that contains the value of the partial goal function for observations processed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except CSRNumericTable.
 
partialCandidatesDistances  Pointer to the nClusters x 1 numeric table that contains the nClusters largest goal function values for the observations processed on the local node, stored in descending order. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
 
partialCandidatesCentroids  Pointer to the nClusters x 1 numeric table that contains the observations with the nClusters largest goal function values processed on the local node, stored in descending order of the goal function. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
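
The two candidate results above can be pictured with a small sketch in plain Python. It does not call the library. It assumes that an observation's contribution to the goal function is its squared Euclidean distance to the nearest input centroid, and it keeps each candidate observation as a full row of p values, which is an assumption about the layout. The names squared_distance and partial_candidates are hypothetical.

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partial_candidates(block, centroids):
    """Illustrative stand-in for partialCandidatesDistances and
    partialCandidatesCentroids on one local node (not the library code).

    Assumption: the per-observation goal function value is the squared
    distance from the observation to its nearest centroid.
    """
    n_clusters = len(centroids)
    scored = []
    for row in block:
        nearest = min(squared_distance(row, c) for c in centroids)
        scored.append((nearest, row))
    # nlargest returns the n_clusters largest entries in descending order.
    top = heapq.nlargest(n_clusters, scored, key=lambda item: item[0])
    distances = [d for d, _ in top]
    candidates = [list(row) for _, row in top]
    return distances, candidates
```

With two centroids, the function returns the two observations that lie farthest from their nearest centroid, largest first.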
 
Result ID  Result
 
assignments  Use when assignFlag = true. Pointer to the n_i x 1 numeric table with 32-bit integer assignments of cluster indices to feature vectors in the input data on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
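
As an illustration of how the Step 1 outputs relate to one another, the following plain-Python sketch processes one data block against the input centroids. It is not the library API: the function local_step and its dictionary layout are hypothetical, and the sketch assumes the goal function is the sum of squared Euclidean distances from each observation to its nearest centroid.

```python
def local_step(block, centroids, assign_flag=False):
    """Illustrative Step 1 computation for one local data block.

    block:      n_i rows of p numbers (the local data)
    centroids:  nClusters rows of p numbers (inputCentroids)
    Returns a dict keyed by the result IDs used in the tables above.
    """
    n_clusters = len(centroids)
    p = len(centroids[0])
    n_observations = [0] * n_clusters
    partial_sums = [[0.0] * p for _ in range(n_clusters)]
    partial_objective = 0.0
    assignments = [] if assign_flag else None

    for row in block:
        # Squared Euclidean distance from this observation to every centroid.
        dists = [sum((x - c) ** 2 for x, c in zip(row, centroid))
                 for centroid in centroids]
        k = min(range(n_clusters), key=dists.__getitem__)  # nearest cluster
        n_observations[k] += 1
        for j in range(p):
            partial_sums[k][j] += row[j]
        # Assumption: the goal function adds up nearest-centroid distances.
        partial_objective += dists[k]
        if assign_flag:
            assignments.append(k)

    return {
        "nObservations": n_observations,
        "partialSums": partial_sums,
        "partialObjectiveFunction": partial_objective,
        "assignments": assignments,
    }
```

The assignments entry stays None unless assign_flag is set, mirroring the note that assignments are used only when assignFlag = true.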

Step 2 on Master Node
Result ID  Result
 

centroids  Pointer to the nClusters x p numeric table with the cluster centroids. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
 
objectiveFunction (deprecated name: goalFunction)  Pointer to the 1 x 1 numeric table that contains the value of the goal function. By default, this result is an object of the HomogenNumericTable class, but you can define this result as an object of any class derived from NumericTable except CSRNumericTable.
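
To show how the master node could combine the partial results from the local nodes into the results above, here is a minimal plain-Python sketch. It does not use the library API. The function master_step and the dictionary layout of the partial results are hypothetical, the goal function is assumed to be additive across nodes, and keeping the previous centroid for an empty cluster is a simplification chosen for this sketch, not the library's documented behavior.

```python
def master_step(partials, previous_centroids):
    """Illustrative Step 2 computation on the master node.

    partials: one dict per local node with the keys "nObservations",
              "partialSums", and "partialObjectiveFunction".
    previous_centroids: nClusters rows of p numbers, used only as a
              fallback for clusters that received no observations.
    """
    n_clusters = len(previous_centroids)
    p = len(previous_centroids[0])
    counts = [0] * n_clusters
    sums = [[0.0] * p for _ in range(n_clusters)]
    objective = 0.0

    # Accumulate counts, coordinate sums, and goal function over all nodes.
    for part in partials:
        for k in range(n_clusters):
            counts[k] += part["nObservations"][k]
            for j in range(p):
                sums[k][j] += part["partialSums"][k][j]
        objective += part["partialObjectiveFunction"]

    centroids = []
    for k in range(n_clusters):
        if counts[k] > 0:
            # New centroid is the mean of all observations in the cluster.
            centroids.append([s / counts[k] for s in sums[k]])
        else:
            # Simplification for this sketch: keep the old centroid.
            centroids.append(list(previous_centroids[k]))
    return centroids, objective
```

Because only counts and sums travel to the master node, the amount of data exchanged depends on nClusters and p rather than on the number of observations held by each local node.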
