Developer Guide and Reference

  • 2021.1
  • 12/04/2020
  • Public Content

Batch Processing

Algorithm Input

The K-Means clustering algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm.
Input IDs and the corresponding inputs:
  • data
    - Pointer to the numeric table with the data to be clustered.
  • inputCentroids
    - Pointer to the numeric table with the initial centroids.
The input for data and inputCentroids can be an object of any class derived from NumericTable.
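To illustrate how the two inputs relate, the sketch below checks that a data set and a set of initial centroids are compatible before clustering. It is plain Python rather than the library's API: the helper name `check_kmeans_inputs` and the list-of-rows stand-in for a numeric table are assumptions made for this example, as is the explicit check that every centroid has as many features as every observation.

```python
def check_kmeans_inputs(data, input_centroids):
    """Return (n_observations, n_features, n_clusters) or raise ValueError.

    Hypothetical helper: `data` and `input_centroids` are lists of rows
    standing in for numeric tables; this is not a library call.
    """
    if not data or not input_centroids:
        raise ValueError("data and input_centroids must be non-empty")
    n_features = len(data[0])
    for row in data:
        if len(row) != n_features:
            raise ValueError("all observations must have the same number of features")
    for centroid in input_centroids:
        if len(centroid) != n_features:
            raise ValueError("each centroid must have as many features as the data")
    return len(data), n_features, len(input_centroids)


# Four observations with two features each, and two initial centroids.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
input_centroids = [[0.0, 0.0], [10.0, 10.0]]
shape = check_kmeans_inputs(data, input_centroids)
```

Here `shape` holds the number of observations, features, and clusters; a centroid with the wrong number of features raises `ValueError` instead.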

Algorithm Parameters

The K-Means clustering algorithm has the following parameters:
  • algorithmFPType
    - Default value: float
    - Description: The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  • method
    - Default value: defaultDense
    - Description: The computation method for K-Means clustering. For CPU: defaultDense (implementation of Lloyd’s algorithm) and lloydCSR (implementation of Lloyd’s algorithm for CSR numeric tables). For GPU: defaultDense (implementation of Lloyd’s algorithm).
  • nClusters
    - Default value: Not applicable
    - Description: The number of clusters. Required to initialize the algorithm.
  • maxIterations
    - Default value: Not applicable
    - Description: The maximum number of iterations. Required to initialize the algorithm.
  • accuracyThreshold
    - Default value: 0.0
    - Description: The threshold for termination of the algorithm.
  • gamma
    - Default value: 1.0
    - Description: The weight to be used in distance calculation for binary categorical features.
  • distanceType
    - Default value: euclidean
    - Description: The measure of closeness between points (observations) being clustered. The only distance type supported so far is the Euclidean distance.
  • assignFlag (DEPRECATED: use resultsToEvaluate instead)
    - Default value: true
    - Description: A flag that enables computation of assignments, that is, assigning cluster indices to respective observations.
  • resultsToEvaluate
    - Default value: computeCentroids | computeAssignments | computeExactObjectiveFunction
    - Description: The 64-bit integer flag that specifies which extra characteristics of the K-Means algorithm to compute. Provide one of the following values to request a single characteristic, or use bitwise OR to request a combination of the characteristics: computeCentroids, for computation of centroids; computeAssignments, for computation of assignments, that is, assigning cluster indices to respective observations; computeExactObjectiveFunction, for computation of the exact objective function.
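The bitwise OR combination described for resultsToEvaluate can be sketched in plain Python as follows. This is an illustration only, not the library's API: the class `ResultsToEvaluate`, the helper `is_requested`, and the numeric bit values are assumptions, since the guide names the flags but does not list their values.

```python
from enum import IntFlag


class ResultsToEvaluate(IntFlag):
    # Bit values are assumed for illustration; the guide does not state them.
    COMPUTE_CENTROIDS = 1
    COMPUTE_ASSIGNMENTS = 2
    COMPUTE_EXACT_OBJECTIVE_FUNCTION = 4


def is_requested(flags, characteristic):
    """Return True if `characteristic` is set in the combined `flags`."""
    return bool(flags & characteristic)


# Request a single characteristic.
only_assignments = ResultsToEvaluate.COMPUTE_ASSIGNMENTS

# Request a combination with bitwise OR (mirrors the documented default).
all_results = (
    ResultsToEvaluate.COMPUTE_CENTROIDS
    | ResultsToEvaluate.COMPUTE_ASSIGNMENTS
    | ResultsToEvaluate.COMPUTE_EXACT_OBJECTIVE_FUNCTION
)
```

Testing a flag with bitwise AND, as `is_requested` does, is the usual counterpart to combining flags with bitwise OR.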

Algorithm Output

The K-Means clustering algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm.
Result IDs and the corresponding results:
  • centroids
    - Pointer to the numeric table with the cluster centroids, computed when the computeCentroids option is enabled.
  • assignments
    - Pointer to the numeric table with assignments of cluster indices to feature vectors in the input data, computed when the computeAssignments option is enabled.
  • objectiveFunction
    - Pointer to the numeric table with the minimum value of the objective function obtained at the last iteration of the algorithm. This value might be inexact; when the computeExactObjectiveFunction option is enabled, the exact objective function is computed.
  • nIterations
    - Pointer to the numeric table with the actual number of iterations done by the algorithm.
By default, each of these results is an object of the HomogenNumericTable class, but you can define it as an object of any class derived from NumericTable except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
You can skip the update of centroids and objectiveFunction in the result and compute assignments using the original inputCentroids. To do this, set the resultsToEvaluate flag to computeAssignments only and set maxIterations to zero.
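The effect of setting maxIterations to zero can be illustrated with a minimal Lloyd's-style loop in plain Python. This is a sketch under stated assumptions, not the library's implementation: the function names are hypothetical, the stopping rule (largest squared centroid shift compared with the accuracy threshold) is an assumption about how a threshold might be applied, and an empty cluster simply keeps its previous centroid.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(data, centroids):
    """Return, for each observation, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(row, centroids[k]))
        for row in data
    ]


def lloyd_kmeans(data, input_centroids, max_iterations, accuracy_threshold=0.0):
    """Hypothetical sketch; returns (centroids, assignments, n_iterations)."""
    centroids = [list(c) for c in input_centroids]
    n_iterations = 0
    for _ in range(max_iterations):
        assignments = assign(data, centroids)
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [row for row, a in zip(data, assignments) if a == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # assumed handling of an empty cluster
        shift = max(squared_distance(o, n) for o, n in zip(centroids, new_centroids))
        centroids = new_centroids
        n_iterations += 1
        if shift <= accuracy_threshold:  # assumed stopping rule
            break
    # With max_iterations == 0 the loop is skipped, so the assignments
    # are computed against the original input centroids.
    return centroids, assign(data, centroids), n_iterations


data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
input_centroids = [[0.0, 0.0], [10.0, 10.0]]

assign_only = lloyd_kmeans(data, input_centroids, max_iterations=0)
full_run = lloyd_kmeans(data, input_centroids, max_iterations=10)
```

In `assign_only` the centroids are returned unchanged and only the assignments carry new information, which mirrors the assignment-only mode described above; `full_run` updates the centroids until they stop moving.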

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.