Developer Guide

Batch Processing

Decision tree classification follows the general workflow described in Training and Prediction > Classification > Usage Model.

Training

For the description of the input and output, refer to Training and Prediction > Classification > Usage Model. In addition to the common input, decision trees can accept the following inputs used for post-pruning:
  • dataForPruning
    - Pointer to the m x p numeric table with the pruning data set. This table can be an object of any class derived from NumericTable.
  • labelsForPruning
    - Pointer to the m x 1 numeric table with class labels. This table can be an object of any class derived from NumericTable except PackedSymmetricMatrix and PackedTriangularMatrix.
At the training stage, the decision tree classifier has the following parameters:
  • algorithmFPType (default: float)
    - The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  • method (default: defaultDense)
    - The computation method used by decision tree classification. The only training method supported so far is the default dense method.
  • nClasses (default: not applicable)
    - The number of classes. A required parameter.
  • splitCriterion (default: infoGain)
    - Split criterion to choose the best test for split nodes. Available split criteria for decision trees:
        • gini - the Gini index
        • infoGain - the information gain
  • pruning (default: reducedErrorPruning)
    - Method to perform post-pruning. Available options for the pruning parameter:
        • reducedErrorPruning - reduced error pruning. Provide the dataForPruning and labelsForPruning inputs if you use pruning (see the training sketch below).
        • none - do not prune.
  • maxTreeDepth (default: 0)
    - Maximum tree depth. A zero value means unlimited depth. Can be any non-negative number.
  • minObservationsInLeafNodes (default: 1)
    - Minimum number of observations in a leaf node. Can be any positive number.
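The following is a minimal training sketch, assuming the oneDAL C++ interface (daal.h): the common classifier inputs plus the two post-pruning inputs are set, the optional parameters keep their defaults (infoGain split criterion, reduced error pruning, unlimited depth), and the trained model is returned. The helper name, the caller-provided tables, and the exact constructor signature (which can differ slightly between library versions) are assumptions, not part of this guide.

    #include "daal.h"

    using namespace daal::algorithms;
    using namespace daal::data_management;

    /* trainData, trainLabels, dataForPruning, and labelsForPruning are NumericTablePtr
       objects prepared by the caller (for example, as in the sketch above);
       nClasses is the number of distinct class labels. */
    classifier::ModelPtr trainDecisionTree(const NumericTablePtr &trainData,
                                           const NumericTablePtr &trainLabels,
                                           const NumericTablePtr &dataForPruning,
                                           const NumericTablePtr &labelsForPruning,
                                           size_t nClasses)
    {
        /* Training algorithm for the decision tree classifier (defaultDense method);
           nClasses is a required training parameter */
        decision_tree::classification::training::Batch<> algorithm(nClasses);

        /* Common classifier inputs */
        algorithm.input.set(classifier::training::data, trainData);
        algorithm.input.set(classifier::training::labels, trainLabels);

        /* Post-pruning inputs, needed because pruning defaults to reducedErrorPruning */
        algorithm.input.set(decision_tree::classification::training::dataForPruning, dataForPruning);
        algorithm.input.set(decision_tree::classification::training::labelsForPruning, labelsForPruning);

        /* Build the tree and return the trained model from the training result */
        algorithm.compute();
        return algorithm.getResult()->get(classifier::training::model);
    }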

Prediction

For the description of the input and output, refer to Training and Prediction > Classification > Usage Model.
At the prediction stage, the decision tree classifier has the following parameters:
  • algorithmFPType (default: float)
    - The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  • method (default: defaultDense)
    - The computation method used by decision tree classification. The only prediction method supported so far is the default dense method.
  • nBins (default: 1)
    - The number of bins used to compute probabilities of the observations belonging to the class. The only value supported in the current version of the library is 1.
  • nClasses (default: 2)
    - The number of classes.
  • resultsToEvaluate (default: classifier::computeClassLabels)
    - The form of computed results:
        • classifier::computeClassLabels - the result contains an n x 1 numeric table with predicted labels
        • classifier::computeClassProbabilities - the result contains an n x nClasses numeric table with the probabilities of belonging to each class
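A matching prediction sketch, again assuming the oneDAL C++ interface, is shown below. It passes the test data and the model trained in the sketch above, computes the predictions, and reads the n x 1 table of predicted labels; resultsToEvaluate is left at its default (classifier::computeClassLabels), so only labels are produced. The helper name and the caller-provided testData, model, and nClasses values are illustrative assumptions.

    #include "daal.h"

    using namespace daal::algorithms;
    using namespace daal::data_management;

    /* testData is an n x p numeric table; model is the classifier::ModelPtr returned by
       the training sketch above. */
    void predictWithDecisionTree(const NumericTablePtr &testData,
                                 const classifier::ModelPtr &model,
                                 size_t nClasses)
    {
        /* Prediction algorithm for the decision tree classifier (defaultDense method) */
        decision_tree::classification::prediction::Batch<> algorithm(nClasses);

        /* Pass the test data and the trained model */
        algorithm.input.set(classifier::prediction::data, testData);
        algorithm.input.set(classifier::prediction::model, model);

        /* Compute and retrieve the n x 1 table of predicted class labels */
        algorithm.compute();
        NumericTablePtr labels = algorithm.getResult()->get(classifier::prediction::prediction);

        /* Read the predicted labels through a block descriptor */
        BlockDescriptor<float> block;
        labels->getBlockOfRows(0, labels->getNumberOfRows(), readOnly, block);
        const float *predicted = block.getBlockPtr();
        /* predicted[i] is the predicted class label of observation i */
        labels->releaseBlockOfRows(block);
    }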
