Developer Guide and Reference

  • 2021.3
  • 06/28/2021
  • Public Content
Contents

Regression Gradient Boosted Trees

Gradient boosted trees regression is the special case of gradient boosted trees. For more details, see Gradient Boosted Trees.

Details

Given n feature vectors x_1 = (x_11, …, x_1p), …, x_n = (x_n1, …, x_np) of dimension p and a vector of dependent variables y = (y_1, …, y_n), the problem is to build a gradient boosted trees regression model that minimizes the loss function based on the predicted and true values.
Training Stage
Training Stage
Gradient boosted trees regression follows the algorithmic framework of gradient boosted trees training with the following squared loss:
L(y, f) = 1/2 * (y - f(x))^2
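The training stage can be sketched in plain Python. This is an illustrative sketch, not the oneDAL implementation: with squared loss, each new tree is fitted to the current residuals (the negative gradient of the loss). Trees are simplified here to one-split stumps on one-dimensional data, and the `shrinkage` step size is an assumption of the sketch.

```python
# Illustrative sketch (not the oneDAL implementation): gradient boosting
# with squared loss fits each new tree to the residuals y - f(x).

def fit_stump(x, residuals):
    """Find the split threshold minimizing squared error of the two leaf means."""
    best = None
    for t in sorted(set(x))[1:]:
        left = [r for xi, r in zip(x, residuals) if xi < t]
        right = [r for xi, r in zip(x, residuals) if xi >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda v: lm if v < t else rm  # the stump's response function

def train_gbt(x, y, n_trees=10, shrinkage=0.3):
    """Boosting loop: each stage fits a stump to the current residuals."""
    pred = [0.0] * len(x)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of squared loss
        stump = fit_stump(x, residuals)
        trees.append(stump)
        pred = [pi + shrinkage * stump(xi) for pi, xi in zip(pred, x)]
    return trees

def predict(trees, v, shrinkage=0.3):
    """The model response is the shrunken sum of all tree responses."""
    return sum(shrinkage * t(v) for t in trees)

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.0, 3.0, 3.0]
model = train_gbt(x, y)
```

After enough boosting rounds the ensemble approaches the target values: `predict(model, 1.0)` is close to 1.0 and `predict(model, 4.0)` is close to 3.0.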
Prediction Stage
Given the gradient boosted trees regression model and vectors x_1, …, x_r, the problem is to calculate the responses for those vectors. To solve the problem for each given feature vector x_i, the algorithm finds the leaf node in each tree in the ensemble, and that leaf node gives the tree's response. The algorithm result is the sum of the responses of all the trees.
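The prediction stage above can be sketched as follows. This is illustrative only; representing each tree as a callable that maps a feature vector to its leaf response is an assumption of the sketch, not the oneDAL API.

```python
# Illustrative sketch: the ensemble response for a feature vector is the
# sum of the individual tree responses.

def predict_ensemble(trees, x):
    """Sum the response of every tree in the ensemble for feature vector x."""
    return sum(tree(x) for tree in trees)

# Two toy "trees" implemented as simple threshold functions (assumptions
# of the sketch):
tree1 = lambda x: 1.0 if x[0] < 2.0 else 2.0
tree2 = lambda x: 0.5 if x[1] < 3.0 else 1.5
```

For example, `predict_ensemble([tree1, tree2], [1.0, 1.0])` evaluates both trees and returns 1.5.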

Usage of Training Alternative

To build a Gradient Boosted Trees Regression model using methods of the Model Builder class of Gradient Boosted Trees Regression, complete the following steps:
  • Create a Gradient Boosted Trees Regression model builder using a constructor with the required number of features and trees.
  • Create a decision tree and add nodes to it:
    • Use the createTree method with the required number of nodes in a tree.
    • Use the addSplitNode and addLeafNode methods to add split and leaf nodes to the created tree. See the note below describing the decision tree structure.
    • After you add all nodes to the current tree, proceed to creating the next one in the same way.
  • Use the getModel method to get the trained Gradient Boosted Trees Regression model after all trees have been created.
Each tree consists of internal nodes (called non-leaf or split nodes) and external nodes (leaf nodes). Each split node denotes a feature test that is a Boolean expression, for example, f < featureValue or f = featureValue, where f is a feature and featureValue is a constant. The test type depends on the feature type: continuous, categorical, or ordinal. For more information on the test types, see Decision Tree.
The induced decision tree is a binary tree, meaning that each non-leaf node has exactly two branches: true and false. Each split node contains featureIndex, the index of the feature used for the feature test in this node, and featureValue, the constant for the Boolean expression in the test. Each leaf node contains a response, the predicted value for this leaf. For more information on decision trees, see Decision Tree.
Add nodes to the created tree in accordance with the pre-calculated structure of the tree. Check that leaf nodes have no child nodes and that each split node has exactly two children.
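The tree structure described above can be sketched schematically in Python. The class and attribute names below are assumptions of the sketch, not the oneDAL ModelBuilder API; the sketch only mirrors the split-node/leaf-node layout (featureIndex and featureValue in split nodes, a response value in leaves).

```python
# Schematic sketch of the described tree structure (assumed names; this is
# not the oneDAL ModelBuilder API).

class SplitNode:
    def __init__(self, feature_index, feature_value, true_child, false_child):
        self.feature_index = feature_index  # index of the tested feature
        self.feature_value = feature_value  # constant in the Boolean test
        self.true_child = true_child        # branch when f < featureValue holds
        self.false_child = false_child      # branch when the test is false

class LeafNode:
    def __init__(self, response):
        self.response = response  # predicted value for this leaf (regression)

def tree_response(node, x):
    """Walk from the root to a leaf and return that leaf's response."""
    while isinstance(node, SplitNode):
        if x[node.feature_index] < node.feature_value:
            node = node.true_child
        else:
            node = node.false_child
    return node.response

# Build nodes in accordance with a pre-calculated structure: every split
# node has exactly two children, and leaf nodes have none.
root = SplitNode(0, 2.0,
                 true_child=LeafNode(1.0),
                 false_child=SplitNode(1, 5.0,
                                       true_child=LeafNode(2.0),
                                       false_child=LeafNode(3.0)))
```

For example, `tree_response(root, [3.0, 4.0])` fails the root test (3.0 < 2.0 is false), then passes the second test (4.0 < 5.0), so it returns the response 2.0.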
Examples
C++ (CPU)
Java*
There is no support for Java on GPU.
Python*

Batch Processing

Gradient boosted trees regression follows the general workflow described in Gradient Boosted Trees and Regression Usage Model.
Training
In addition to parameters of the gradient boosted trees described in Batch Processing, the gradient boosted trees regression training algorithm has the following parameters:
  • algorithmFPType (default: float) – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  • method (default: defaultDense) – The computation method used by gradient boosted trees regression. The only training method supported so far is the default dense method.
  • loss (default: squared) – Loss function type.
Prediction
In addition to the common regression parameters, the gradient boosted trees regression has the following parameters at the prediction stage:
  • algorithmFPType (default: float) – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
  • method (default: defaultDense) – The computation method used by gradient boosted trees regression. The only prediction method supported so far is the default dense method.
  • numIterations (default: 0) – An integer parameter that indicates how many trained iterations of the model should be used in prediction. The default value 0 means no limit: all the trained trees are used.
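The numIterations semantics can be sketched as follows. This is an illustrative sketch of the behavior described above (the function name is an assumption): with numIterations = k > 0, only the first k trees contribute to the prediction; with the default 0, all trees contribute.

```python
# Illustrative sketch of the numIterations parameter: truncate the
# ensemble to its first k trees, or use all trees when k == 0.

def ensemble_predict(tree_responses, num_iterations=0):
    """tree_responses: per-tree responses for one feature vector."""
    k = len(tree_responses) if num_iterations == 0 else num_iterations
    return sum(tree_responses[:k])
```

For example, with per-tree responses [1.0, 0.5, 0.25], the default gives 1.75, while num_iterations=2 gives 1.5.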

Examples

C++ (CPU)
Batch Processing:
Java*
There is no support for Java on GPU.
Batch Processing:
Python* with DPC++ support
Batch Processing:
Python*
Batch Processing:

Product and Performance Information

1

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.