Developer Guide

Details

Given a set $X = \{x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\}$ of $n$ $p$-dimensional feature vectors and a vector of class labels $y = (y_1, \ldots, y_n)$, where $y_i \in \{0, 1, \ldots, C - 1\}$ describes the class to which the feature vector $x_i$ belongs and $C$ is the number of classes, the problem is to build a gradient boosted trees classifier.

Training Stage

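Gradient boosted trees classification follows the algorithmic framework of gradient boosted trees training. For a classification problem with $K$ classes ($K = C$ in the notation above), $K$ regression trees are constructed on each iteration, one for each output class. The loss function is cross-entropy (multinomial deviance):

$$L(y, f) = -\sum_{k=0}^{K-1} I(y = k) \ln p_k(x),$$

where $p_k(x) = \frac{e^{f_k(x)}}{\sum_{j=0}^{K-1} e^{f_j(x)}}$, $f_k(x)$ is the ensemble response for class $k$, and $I(\cdot)$ is the indicator function.

Binary classification is a special case in which a single regression tree is trained on each iteration. The loss function is

$$L(y, f) = -\left(y \ln \sigma(f) + (1 - y) \ln(1 - \sigma(f))\right),$$

where $\sigma(f) = \frac{1}{1 + e^{-f}}$.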
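
The two losses named above can be sketched in a few lines of Python. This is an illustration only, assuming the standard softmax form of cross-entropy (multinomial deviance) and the standard sigmoid form for the binary case; the function names are hypothetical and are not part of the library API. The negative gradient shown is what a textbook gradient boosting step would fit the per-class regression trees to; the library's actual split and leaf computations may differ.

```python
import math

def softmax(scores):
    """Turn raw per-class responses f_k(x) into probabilities p_k(x)."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multiclass_loss(label, scores):
    """Cross-entropy for one sample: -ln p_label(x)."""
    return -math.log(softmax(scores)[label])

def multiclass_negative_gradient(label, scores):
    """Negative gradient of the loss w.r.t. each f_k: I(y = k) - p_k(x)."""
    probs = softmax(scores)
    return [(1.0 if k == label else 0.0) - p for k, p in enumerate(probs)]

def sigmoid(f):
    """Logistic function mapping a single response to a probability."""
    return 1.0 / (1.0 + math.exp(-f))

def binary_loss(label, score):
    """Binary cross-entropy for a label in {0, 1} and a single response."""
    p = sigmoid(score)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

With two classes and the response of class 0 fixed at zero, `multiclass_loss(1, [0.0, f])` equals `binary_loss(1, f)`, which is why one tree per iteration is enough in the binary case.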

Prediction Stage

Given the gradient boosted trees classifier model and vectors $x_1, \ldots, x_r$, the problem is to calculate labels for those vectors. To solve the problem for each given feature vector $x_i$, the algorithm finds the leaf node in each tree of the ensemble, and that leaf node gives the tree response. The algorithm computes the sum of the responses of all the trees for each class and chooses the label $y$ corresponding to the class with the maximal response value (highest class probability).
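
The prediction procedure described above can be sketched as follows. The `Node` layout, the "go left when the feature value is at most the threshold" rule, and the per-class list-of-trees model are assumptions made for illustration; they do not reflect the library's internal model format.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    """Hypothetical tree node: internal nodes split, leaves hold a response."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0  # tree response, meaningful only at a leaf

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def tree_response(root: Node, x: List[float]) -> float:
    """Descend from the root to a leaf and return that leaf's response."""
    node = root
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value

def predict_label(ensemble: List[List[Node]], x: List[float]) -> int:
    """ensemble[k] holds the trees for class k; pick the class whose
    summed tree responses are largest (first class wins on ties)."""
    sums = [sum(tree_response(tree, x) for tree in trees) for trees in ensemble]
    return max(range(len(sums)), key=sums.__getitem__)
```

Because softmax is monotonic in each response, taking the class with the largest summed response selects the same class as taking the highest class probability.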

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804