K-Means Clustering

K-Means is one of the simplest and most widely used clustering methods. It partitions a data set into a small number of clusters so that feature vectors within a cluster are more similar to one another than to feature vectors in other clusters. Each cluster is characterized by a representative point, called a centroid, and a cluster radius.
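The partitioning described above can be illustrated with a minimal sketch of the standard iterative scheme (Lloyd's algorithm), written in plain Python rather than with any particular library. The function names kmeans and cluster_radii, the explicit initial_centroids argument, and the choice to define the radius as the largest member-to-centroid distance are assumptions made for illustration; they are not taken from the library this page documents.

```python
import math


def kmeans(points, initial_centroids, max_iterations=100):
    """Illustrative Lloyd's algorithm on a list of equal-length tuples."""
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    assignments = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the partition is stable
        centroids = new_centroids
    return centroids, assignments


def cluster_radii(points, centroids, assignments):
    """Assumed definition: radius = largest distance from centroid to a member."""
    radii = [0.0] * len(centroids)
    for p, a in zip(points, assignments):
        radii[a] = max(radii[a], math.dist(p, centroids[a]))
    return radii


# Two well-separated groups of three 2-D feature vectors each.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_centroids, sample_labels = kmeans(sample_points, [(0, 0), (10, 10)])
sample_radii = cluster_radii(sample_points, sample_centroids, sample_labels)
```

Passing the initial centroids explicitly keeps the sketch deterministic. Practical implementations usually offer several initialization strategies, and the result can depend on which one is used.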

Performance Considerations

To get the best overall performance of the association rules algorithm, whenever possible use the following numeric tables and data types:

  • An SOA (structure-of-arrays) numeric table of type int to store features.

  • A homogeneous numeric table of type int to store large item sets, support values, and left-hand-side and right-hand-side parts of association rules.

Multi-class Classifier

While some classification algorithms naturally handle more than two classes, others, such as Support Vector Machines (SVM), inherently solve only two-class problems. These two-class (or binary) classifiers can be turned into multi-class classifiers by using different strategies, such as One-Against-Rest or One-Against-One.
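The two strategies named above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: train_binary is a hypothetical stand-in for a binary learner (it scores by distance to class means and is not an SVM), and all function names and the sample data are invented for the example. One-Against-Rest trains one binary classifier per class and picks the class with the highest score; One-Against-One trains one classifier per pair of classes and lets them vote.

```python
import math
from collections import Counter
from itertools import combinations


def train_binary(points, labels):
    """Hypothetical binary learner (NOT an SVM): compares distances to class means.

    labels must contain only +1 and -1. The returned scorer is positive
    when a point lies closer to the mean of the +1 class.
    """
    def mean(rows):
        return tuple(sum(col) / len(rows) for col in zip(*rows))

    pos = mean([p for p, l in zip(points, labels) if l == 1])
    neg = mean([p for p, l in zip(points, labels) if l == -1])
    return lambda x: math.dist(x, neg) - math.dist(x, pos)


def fit_one_against_rest(points, labels):
    """Train one scorer per class: that class (+1) versus all others (-1)."""
    return {
        c: train_binary(points, [1 if l == c else -1 for l in labels])
        for c in sorted(set(labels))
    }


def predict_one_against_rest(models, x):
    """Choose the class whose scorer returns the highest value."""
    return max(models, key=lambda c: models[c](x))


def fit_one_against_one(points, labels):
    """Train one scorer per unordered pair of classes, using only their points."""
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        subset = [
            (p, 1 if l == a else -1)
            for p, l in zip(points, labels)
            if l in (a, b)
        ]
        pts, signs = zip(*subset)
        models[(a, b)] = train_binary(pts, signs)
    return models


def predict_one_against_one(models, x):
    """Each pairwise scorer casts one vote; the most-voted class wins."""
    votes = Counter(a if score(x) > 0 else b for (a, b), score in models.items())
    return votes.most_common(1)[0][0]


# Three well-separated classes of 2-D feature vectors (invented sample data).
train_points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1),
                (0, 10), (1, 10), (0, 11)]
train_labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
ovr_models = fit_one_against_rest(train_points, train_labels)
ovo_models = fit_one_against_one(train_points, train_labels)
```

For k classes, One-Against-Rest trains k binary classifiers, each on the full data set, while One-Against-One trains k(k-1)/2 classifiers, each on the data of only two classes.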
