Developer Guide and Reference

  • 2021.1
  • 12/04/2020
  • Public Content

Logistic Regression

Logistic regression is a method for modeling the relationships between one or more explanatory variables and a categorical variable by expressing the posterior statistical distribution of the categorical variable via linear functions on observed data. If the categorical variable is binary, taking only two values, “0” and “1”, the logistic regression is simple; otherwise, it is multinomial.

Details

Given a set $X = \{x_1 = (x_{11}, \ldots, x_{1p}), \ldots, x_n = (x_{n1}, \ldots, x_{np})\}$ of n p-dimensional feature vectors and a vector of class labels $y = (y_1, \ldots, y_n)$, where $y_i \in \{0, 1, \ldots, K-1\}$ describes the class to which the feature vector $x_i$ belongs and K is the number of classes, the problem is to train a logistic regression model.
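The logistic regression model is the set of vectors $\beta = \{\beta_0 = (\beta_{00}, \ldots, \beta_{0p}), \ldots, \beta_{K-1} = (\beta_{K-1,0}, \ldots, \beta_{K-1,p})\}$ that gives the posterior probability
$$P\{y = k \mid x\} = p_k(x, \beta) = \frac{e^{f_k(x, \beta)}}{\sum_{i=0}^{K-1} e^{f_i(x, \beta)}}, \quad \text{where } f_k(x, \beta) = \beta_{k0} + \sum_{j=1}^{p} \beta_{kj} x_j \qquad (1)$$
for a given feature vector $x = (x_1, \ldots, x_p)$ and class label $y \in \{0, 1, \ldots, K-1\}$ for each $k = 0, \ldots, K-1$. See [Hastie2009].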
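As an illustration of formula (1), the following minimal sketch evaluates the multinomial posterior, assuming it is the usual softmax over per-class linear scores. The function names and the coefficient values are made up for this example and are not part of the library API.

```python
import math


def class_scores(x, beta):
    """Linear scores f_k(x, beta) = beta_k0 + sum_j beta_kj * x_j, one per class."""
    return [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in beta]


def posterior(x, beta):
    """Posterior probabilities p_k(x, beta) for every class k."""
    scores = class_scores(x, beta)
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative coefficients for K = 3 classes and p = 2 features;
# each row is (beta_k0, beta_k1, beta_k2).
beta = [
    [0.0, 1.0, -1.0],
    [0.5, -0.5, 0.5],
    [-0.5, 0.0, 0.0],
]
probs = posterior([1.0, 2.0], beta)
```

The probabilities sum to one, and the class with the largest linear score receives the largest probability.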
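If the categorical variable is binary, the model is defined as a single vector $\beta_0 = (\beta_{00}, \ldots, \beta_{0p})$ that determines the posterior probability
$$P\{y = 1 \mid x\} = \sigma(x, \beta) = \frac{1}{1 + e^{-f(x, \beta)}}, \quad P\{y = 0 \mid x\} = 1 - P\{y = 1 \mid x\}, \quad \text{where } f(x, \beta) = \beta_{00} + \sum_{j=1}^{p} \beta_{0j} x_j \qquad (2)$$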
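For the binary case, a minimal sketch of formula (2) follows, assuming it is the standard logistic (sigmoid) function of a single linear score. The function name and the sample values are illustrative only.

```python
import math


def sigmoid_posterior(x, beta0):
    """P{y = 1 | x} for a single coefficient vector beta0 = (b0, b1, ..., bp)."""
    f = beta0[0] + sum(bj * xj for bj, xj in zip(beta0[1:], x))
    # Pick the algebraically equivalent form whose exp argument is non-positive,
    # so the evaluation cannot overflow for large |f|.
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)


# Here the linear score is 0.5 + 1.0 * 2.0 - 2.5 * 1.0 = 0, the decision boundary.
p1 = sigmoid_posterior([2.0, 1.0], [0.5, 1.0, -2.5])
p0 = 1.0 - p1
```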
Training Stage
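The training procedure is an iterative algorithm that minimizes the objective function
$$L(X, y, \beta) = -\frac{1}{n} \sum_{i=1}^{n} \log p_{y_i}(x_i, \beta) + \lambda_1 \sum_{k=0}^{K-1} \|\beta_k\|_{L_1} + \lambda_2 \sum_{k=0}^{K-1} \|\beta_k\|_{L_2}$$
where $p_{y_i}(x_i, \beta)$ is the posterior probability of the observed class $y_i$ for the feature vector $x_i$. The first term is the negative log-likelihood of conditional Y given X, and the latter terms are regularization terms that penalize the complexity of the model (large $\beta$ values). $\lambda_1$ and $\lambda_2$ are non-negative regularization parameters applied to the L1 and L2 norms of the vectors in $\beta$.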
For more details, see [Hastie2009], [Bishop2006].
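To make the structure of the objective concrete, the sketch below evaluates an averaged negative log-likelihood plus L1 and L2 penalty terms. It is an assumption-laden illustration, not the library's implementation: it averages the log-likelihood over the observations, takes both norms literally over all coefficients (including the intercept), and does not square the L2 norm. The library may differ on each of these points. All names are hypothetical.

```python
import math


def log_softmax(scores):
    """Logarithms of the softmax probabilities, computed without overflow."""
    shift = max(scores)
    log_norm = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return [s - log_norm for s in scores]


def objective(X, y, beta, penalty_l1=0.0, penalty_l2=0.0):
    """Averaged negative log-likelihood plus L1 and L2 penalties (assumed form)."""
    nll = 0.0
    for x, label in zip(X, y):
        scores = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in beta]
        nll -= log_softmax(scores)[label]
    l1 = sum(sum(abs(v) for v in b) for b in beta)            # sum of L1 norms
    l2 = sum(math.sqrt(sum(v * v for v in b)) for b in beta)  # sum of L2 norms
    return nll / len(X) + penalty_l1 * l1 + penalty_l2 * l2


# With all-zero coefficients every class is equally likely,
# so the unpenalized objective equals log(K).
X = [[1.0, 2.0], [3.0, 0.0]]
y = [0, 1]
zero_beta = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
value = objective(X, y, zero_beta)
```

Increasing either penalty parameter raises the objective for any nonzero coefficient set, which is what pushes the solver toward smaller coefficients.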
To minimize the objective function, the library supports the iterative algorithms defined by the interface of daal::algorithms::iterative_solver. See Iterative Solver.
Prediction Stage
Given a logistic regression model and vectors $x_1, \ldots, x_r$, the problem is to calculate the responses for those vectors and, if required, their probabilities and the logarithms of those probabilities. The computation is based on formula (1) in the multinomial case and on formula (2) in the binary case.
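The prediction stage can be sketched in plain Python as follows. This is an illustration under the assumption that formula (1) is a softmax over linear scores; the function name, its keyword arguments, and the dictionary layout are hypothetical and only mirror the result names used in this guide.

```python
import math


def predict(X, beta, with_probabilities=False, with_log_probabilities=False):
    """Return responses and, on request, class probabilities and their logarithms."""
    result = {"prediction": [], "probabilities": None, "logProbabilities": None}
    if with_probabilities:
        result["probabilities"] = []
    if with_log_probabilities:
        result["logProbabilities"] = []
    for x in X:
        scores = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in beta]
        shift = max(scores)
        log_norm = shift + math.log(sum(math.exp(s - shift) for s in scores))
        log_p = [s - log_norm for s in scores]
        # The response is the class with the highest posterior probability.
        result["prediction"].append(max(range(len(scores)), key=lambda k: scores[k]))
        if with_probabilities:
            result["probabilities"].append([math.exp(v) for v in log_p])
        if with_log_probabilities:
            result["logProbabilities"].append(log_p)
    return result


# Illustrative model with K = 3 classes and p = 2 features.
beta = [
    [0.0, 1.0, -1.0],
    [0.5, -0.5, 0.5],
    [-0.5, 0.0, 0.0],
]
out = predict([[1.0, 2.0], [3.0, 0.0]], beta,
              with_probabilities=True, with_log_probabilities=True)
```

A binary model defined by a single vector can be handled by the same sketch by passing two rows, an all-zero row for class 0 and the model vector for class 1, which yields the same probabilities as the sigmoid form.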

Usage of Training Alternative

To build a Logistic Regression model using methods of the Model Builder class of Logistic Regression, complete the following steps:
  • Create a Logistic Regression model builder using a constructor with the required number of responses and features.
  • Use the setBeta method to add the set of pre-calculated coefficients to the model. Specify random access iterators to the first and the last element of the set of coefficients [ISO/IEC 14882:2011 §24.2.7]. If your set of coefficients does not contain an intercept, interceptFlag is automatically set to False; otherwise, it is set to True.
  • Use the getModel method to get the trained Logistic Regression model.
  • Use the getStatus method to check the status of the model building process. If the DAAL_NOTHROW_EXCEPTIONS macro is defined, the status report contains the list of errors that describe the problems the API encountered (in case of an API runtime failure).
If you use the setBeta method to update coefficients after calling the getModel method, the initial model is automatically updated with the new $\beta$ coefficients.
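The workflow above can be mimicked with a small stand-in class. Everything in this sketch is hypothetical: the class and method names are not the library API, and the rule used to detect an intercept (comparing the number of supplied coefficients with the number expected with and without an intercept) is an assumption made for illustration.

```python
class SketchModelBuilder:
    """Hypothetical stand-in for a logistic regression model builder."""

    def __init__(self, n_features, n_classes):
        self.n_features = n_features
        # Assumption: a binary model stores one coefficient vector,
        # a multinomial model stores one vector per class.
        self.n_vectors = 1 if n_classes == 2 else n_classes
        self.intercept_flag = None
        self._beta = []    # shared with every model handed out by get_model
        self._errors = []

    def set_beta(self, coefficients):
        values = list(coefficients)
        if len(values) == self.n_vectors * (self.n_features + 1):
            self.intercept_flag = True
        elif len(values) == self.n_vectors * self.n_features:
            self.intercept_flag = False
        else:
            self._errors.append("unexpected number of coefficients: %d" % len(values))
            return
        width = len(values) // self.n_vectors
        rows = [values[i * width:(i + 1) * width] for i in range(self.n_vectors)]
        if not self.intercept_flag:
            rows = [[0.0] + row for row in rows]  # store a zero intercept
        self._beta[:] = rows  # in-place update keeps earlier model references current

    def get_model(self):
        return self._beta

    def get_status(self):
        return list(self._errors)


builder = SketchModelBuilder(n_features=2, n_classes=3)
builder.set_beta([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # 3 x 2 values: no intercept
model = builder.get_model()
flag_without_intercept = builder.intercept_flag
builder.set_beta(range(9))                        # 3 x 3 values: intercept included
flag_with_intercept = builder.intercept_flag
```

Because the stand-in updates its coefficient storage in place, the `model` obtained before the second `set_beta` call reflects the new coefficients, which mirrors the automatic update described above.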
Examples
Java*
There is no support for Java on GPU.

Batch Processing

The logistic regression algorithm follows the general workflow described in Classification Usage Model.
Training
For a description of the input and output, refer to Classification Usage Model.
In addition to the parameters of classifier described in Classification Usage Model, the logistic regression batch training algorithm has the following parameters:
| Parameter | Default Value | Description |
| --- | --- | --- |
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | The computation method used by the logistic regression. The only training method supported so far is the default dense method. |
| nClasses | Not applicable | The number of classes. A required parameter. |
| interceptFlag | True | A flag that indicates a need to compute the intercept terms $\beta_{k0}$. |
| penaltyL1 | 0 | L1 regularization coefficient. L1 regularization is not supported on GPU. |
| penaltyL2 | 0 | L2 regularization coefficient. |
| optimizationSolver | | |
Prediction
For a description of the input, refer to Classification Usage Model.
At the prediction stage, the logistic regression batch algorithm has the following parameters:
| Parameter | Default Value | Description |
| --- | --- | --- |
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | The computation method used by logistic regression. The only prediction method supported so far is the default dense method. |
| nClasses | Not applicable | The number of classes. A required parameter. |
| resultsToCompute | computeClassesLabels | The 64-bit integer flag that specifies which extra characteristics of the logistic regression to compute. Provide one of the following values to request a single characteristic or use bitwise OR to request a combination of the characteristics: computeClassesLabels for prediction, computeClassesProbabilities for probabilities, computeClassesLogProbabilities for logProbabilities. |
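Combining the resultsToCompute values with bitwise OR can be illustrated with Python's standard `enum.IntFlag`. The flag names follow this guide, but the numeric values and the helper function are assumptions made for the example; the library defines its own constants.

```python
from enum import IntFlag


class ResultToCompute(IntFlag):
    # Numeric values are hypothetical; only the names come from this guide.
    computeClassesLabels = 1
    computeClassesProbabilities = 2
    computeClassesLogProbabilities = 4


# Result table produced for each flag, as listed in the parameter description.
TABLE_FOR_FLAG = {
    ResultToCompute.computeClassesLabels: "prediction",
    ResultToCompute.computeClassesProbabilities: "probabilities",
    ResultToCompute.computeClassesLogProbabilities: "logProbabilities",
}


def requested_tables(results_to_compute):
    """Names of the result tables that a given flag combination requests."""
    return [name for flag, name in TABLE_FOR_FLAG.items() if results_to_compute & flag]


# Request class labels together with log-probabilities.
requested = (ResultToCompute.computeClassesLabels
             | ResultToCompute.computeClassesLogProbabilities)
tables = requested_tables(requested)
```

Any table whose flag is absent from the combination is simply not requested, so in this example the probabilities table would not be computed.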
Output
In addition to classifier output, logistic regression prediction calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm.
| Result ID | Result |
| --- | --- |
| probabilities | A numeric table of size $n \times \text{nClasses}$ containing probabilities of classes computed when the computeClassesProbabilities option is enabled. |
| logProbabilities | A numeric table of size $n \times \text{nClasses}$ containing logarithms of classes’ probabilities computed when the computeClassesLogProbabilities option is enabled. |
Note that:
  • If resultsToCompute does not contain computeClassesLabels, the prediction table is NULL.
  • If resultsToCompute does not contain computeClassesProbabilities, the probabilities table is NULL.
  • If resultsToCompute does not contain computeClassesLogProbabilities, the logProbabilities table is NULL.
  • By default, each numeric table of this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except for PackedSymmetricMatrix and PackedTriangularMatrix.

Examples

Java*
There is no support for Java on GPU.
Batch Processing:
Python* with DPC++ support
