Batch Processing

Algorithm Input

The K-Means clustering algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
|---|---|
| data | Pointer to the n x p numeric table with the data to be clustered. The input can be an object of any class derived from NumericTable. |
| inputCentroids | Pointer to the nClusters x p numeric table with the initial centroids. The input can be an object of any class derived from NumericTable. |
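
The shape requirements on the two inputs can be illustrated with a small check. This is a minimal sketch in plain Python, not the library's API: the helper name check_kmeans_inputs is hypothetical, and nested lists stand in for NumericTable objects.

```python
def check_kmeans_inputs(data, input_centroids, n_clusters):
    """Return (n, p) if the shapes match the input table, else raise ValueError.

    Hypothetical helper: nested lists stand in for numeric tables.
    """
    if not data or not data[0]:
        raise ValueError("data must be a non-empty n x p table")
    n, p = len(data), len(data[0])
    if any(len(row) != p for row in data):
        raise ValueError("every row of data must have p values")
    if len(input_centroids) != n_clusters:
        raise ValueError("inputCentroids must have nClusters rows")
    if any(len(row) != p for row in input_centroids):
        raise ValueError("every row of inputCentroids must have p values")
    return n, p


# Four observations with two features each, and two initial centroids.
data = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]]
input_centroids = [[0.0, 0.0], [5.0, 5.0]]

n, p = check_kmeans_inputs(data, input_centroids, n_clusters=2)
print(n, p)
```

Both tables share the feature count p, so a mismatch between data and inputCentroids shows up as a column-count error before any clustering starts.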

Algorithm Parameters

The K-Means clustering algorithm has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Available computation methods for K-Means clustering: defaultDense (implementation of Lloyd's algorithm) and lloydCSR (implementation of Lloyd's algorithm for CSR numeric tables). |
| nClusters | Not applicable | The number of clusters. Required to initialize the algorithm. |
| maxIterations | Not applicable | The maximum number of iterations. Required to initialize the algorithm. |
| accuracyThreshold | 0.0 | The threshold for termination of the algorithm. |
| gamma | 1.0 | The weight to be used in distance calculation for binary categorical features. |
| distanceType | euclidean | The measure of closeness between points (observations) being clustered. The only distance type currently supported is the Euclidean distance. |
| assignFlag | true | A flag that enables computation of assignments, that is, assigning cluster indices to respective observations. |
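
To make the roles of these parameters concrete, the following is a minimal pure-Python sketch of Lloyd's algorithm. It is not the library implementation. Several points are assumptions made for illustration: the function name kmeans_sketch is hypothetical, the objective is taken to be the sum of squared Euclidean distances to the nearest centroid, the run stops early when the change in that objective between iterations falls below accuracy_threshold, and gamma and categorical features are ignored. The number of clusters is implied by the number of rows in input_centroids.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(data, centroids):
    """Return (assignments, objective) for the given centroids."""
    assignments, objective = [], 0.0
    for point in data:
        distances = [squared_distance(point, c) for c in centroids]
        best = distances.index(min(distances))  # ties go to the lowest index
        assignments.append(best)
        objective += distances[best]
    return assignments, objective


def kmeans_sketch(data, input_centroids, max_iterations,
                  accuracy_threshold=0.0, assign_flag=True):
    """Illustrative Lloyd iteration; names mirror the parameter table."""
    centroids = [list(c) for c in input_centroids]
    p = len(data[0])
    previous_objective = None
    n_iterations = 0
    for _ in range(max_iterations):
        assignments, objective = assign(data, centroids)
        # Update step: move each centroid to the mean of its points.
        for k in range(len(centroids)):
            members = [pt for pt, a in zip(data, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(pt[j] for pt in members) / len(members)
                                for j in range(p)]
        n_iterations += 1
        # Assumed stopping rule: small change in the objective.
        if (previous_objective is not None
                and abs(previous_objective - objective) < accuracy_threshold):
            break
        previous_objective = objective
    final_assignments, final_objective = assign(data, centroids)
    result = {"centroids": centroids,
              "objectiveFunction": final_objective,
              "nIterations": n_iterations}
    if assign_flag:
        result["assignments"] = final_assignments
    return result


data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
result = kmeans_sketch(data, [[0.0, 0.0], [10.0, 0.0]],
                       max_iterations=5, accuracy_threshold=1e-9)
print(result)
```

With the strict comparison used here, an accuracy_threshold of 0.0 never triggers early termination, so the sketch always runs max_iterations iterations in that case.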

Algorithm Output

The K-Means clustering algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
|---|---|
| centroids | Pointer to the nClusters x p numeric table with the cluster centroids. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
| assignments | Use when assignFlag = true. Pointer to the n x 1 numeric table with assignments of cluster indices to feature vectors in the input data. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
| objectiveFunction (replaces the deprecated goalFunction) | Pointer to the 1 x 1 numeric table with the value of the objective (goal) function. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except CSRNumericTable. |
| nIterations | Pointer to the 1 x 1 numeric table with the actual number of iterations done by the algorithm. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |

You can skip the update of centroids and objectiveFunction in the result and compute assignments using the original inputCentroids. To do this, set assignFlag to true and maxIterations to zero.
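
The assignment-only mode described above amounts to a single nearest-centroid pass with no update step. The sketch below shows that idea in plain Python. It is an illustration rather than the library's code: the function name assign_to_nearest is hypothetical, and squared Euclidean distance is assumed as the closeness measure.

```python
def assign_to_nearest(data, centroids):
    """Assign each observation to the index of its closest centroid.

    Mirrors the effect of assignFlag = true with maxIterations = 0:
    the centroids are used as given and never updated.
    """
    assignments = []
    for point in data:
        distances = [sum((x - c) ** 2 for x, c in zip(point, centroid))
                     for centroid in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments


# Centroids from an earlier run, reused to label new observations.
input_centroids = [[0.0, 0.0], [10.0, 10.0]]
new_points = [[1.0, 2.0], [9.0, 8.5], [4.0, 4.0]]

assignments = assign_to_nearest(new_points, input_centroids)
print(assignments)
```

This pattern is useful when centroids were computed earlier and only new observations need cluster labels.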
