Developer Guide and Reference

  • 2021.2
  • 03/26/2021
  • Public Content

Quality Metrics for Linear Regression

Given a data set $X = (x_i)$ that contains vectors of input variables $x_i = (x_{i1}, \ldots, x_{ip})$, respective responses $z_i = (z_{i1}, \ldots, z_{ik})$ computed at the prediction stage of the linear regression model defined by its coefficients $\beta_{ht}$, $h = 1, \ldots, k$, $t = 1, \ldots, p$, and expected responses $y_i = (y_{i1}, \ldots, y_{ik})$, $i = 1, \ldots, n$, the problem is to evaluate the linear regression model by computing the root mean square error, the variance-covariance matrix of beta coefficients, and the test statistics listed in the tables below. See Linear Regression for additional details and notations.
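For linear regression, the library computes the statistics listed in the tables below for testing the insignificance of beta coefficients. The set of statistics depends on which of the following values of QualityMetricsId is used:
  • singleBeta for testing a single beta
  • groupOfBetas for testing a group of betas
For more details, see [Hastie2009].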

Details

The statistics are computed given the following assumptions about the data distribution:
  • Responses $y_{ij}$, $i = 1, \ldots, n$, are independent and have a constant variance $\sigma_j^2$, $j = 1, \ldots, k$.
  • The conditional expectation of responses $y_{\cdot j}$, $j = 1, \ldots, k$, is linear in the input variables $x_{\cdot} = (x_{\cdot 1}, \ldots, x_{\cdot p})$.
  • Deviations of $y_{ij}$, $i = 1, \ldots, n$, around the mean of expected responses $\mathrm{ERM}_j$, $j = 1, \ldots, k$, are additive and Gaussian.
Testing Insignificance of a Single Beta
The library uses the following quality metrics:
| Quality Metric | Definition |
| --- | --- |
| Root Mean Square (RMS) Error | $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{ij} - z_{ij})^2}$, $j = 1, \ldots, k$ |
| Vector of variances $\sigma^2 = (\sigma_1^2, \ldots, \sigma_k^2)$ | $\sigma_j^2 = \frac{1}{n - p - 1} \sum_{i=1}^{n} (y_{ij} - z_{ij})^2$, $j = 1, \ldots, k$ |
| A set of variance-covariance matrices $C = C_1, \ldots, C_k$ for vectors of betas $\beta_{jt}$, $j = 1, \ldots, k$ | $C_j = (X^T X)^{-1} \sigma_j^2$, $j = 1, \ldots, k$ |
| Z-score statistics used in testing of insignificance of a single coefficient $\beta_{jt}$ | $\mathrm{zscore}_j = \frac{\beta_{jt}}{\sigma_j \sqrt{v_t}}$, $j = 1, \ldots, k$, where $\sigma_j^2$ is the $j$-th element of the vector of variances $\sigma^2$ and $v_t$ is the $t$-th diagonal element of the matrix $(X^T X)^{-1}$ |
| Confidence interval for $\beta_{jt}$ | $\left(\beta_{jt} - pc_{1-\alpha} \sqrt{v_t}\, \sigma_j,\ \beta_{jt} + pc_{1-\alpha} \sqrt{v_t}\, \sigma_j\right)$, $j = 1, \ldots, k$, where $pc_{1-\alpha}$ is the $(1 - \alpha)$ percentile of the Gaussian distribution, $\sigma_j^2$ is the $j$-th element of the vector of variances $\sigma^2$, and $v_t$ is the $t$-th diagonal element of the matrix $(X^T X)^{-1}$ |
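The single-beta metrics can be sketched in plain C++ for one dependent variable. This is a minimal illustration, not the oneDAL implementation: it assumes the usual definitions (RMS error as the square root of the mean squared residual, variance as the residual sum of squares divided by $n - p - 1$, and z-score as $\beta / (\sigma \sqrt{v})$ with $v$ the matching diagonal element of $(X^T X)^{-1}$). The function names `sumSquaredResiduals`, `rmsError`, `residualVariance`, and `zScore` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; they do not call the oneDAL API.

// Sum of squared residuals between expected (y) and predicted (z) responses
// of a single dependent variable.
double sumSquaredResiduals(const std::vector<double>& y, const std::vector<double>& z) {
    assert(y.size() == z.size());
    double s = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double r = y[i] - z[i];
        s += r * r;
    }
    return s;
}

// RMS error: sqrt((1/n) * sum of squared residuals).
double rmsError(const std::vector<double>& y, const std::vector<double>& z) {
    return std::sqrt(sumSquaredResiduals(y, z) / static_cast<double>(y.size()));
}

// Residual variance: sum of squared residuals divided by (n - p - 1),
// where p is the number of input variables (an intercept is assumed).
double residualVariance(const std::vector<double>& y, const std::vector<double>& z,
                        std::size_t p) {
    assert(y.size() > p + 1);
    return sumSquaredResiduals(y, z) / static_cast<double>(y.size() - p - 1);
}

// Z-score of one coefficient: beta / (sigma * sqrt(v)), where sigma is the
// square root of the residual variance and v is the matching diagonal
// element of (X^T X)^{-1}.
double zScore(double beta, double variance, double v) {
    return beta / (std::sqrt(variance) * std::sqrt(v));
}
```

As a worked case, fitting an intercept and one slope to x = (0, 1, 2, 3), y = (1, 3, 5, 8) by least squares gives betas (0.8, 2.3), predictions z = (0.8, 3.1, 5.4, 7.7), and diagonal elements (0.7, 0.2) of $(X^T X)^{-1}$. With these inputs, `rmsError(y, z)` is about 0.274, `residualVariance(y, z, 1)` is 0.15, and `zScore(2.3, 0.15, 0.2)` is about 13.28.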
Testing Insignificance of a Group of Betas
The library uses the following quality metrics:
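| Quality Metric | Definition |
| --- | --- |
| Mean of expected responses, $\mathrm{ERM} = (\mathrm{ERM}_1, \ldots, \mathrm{ERM}_k)$ | $\mathrm{ERM}_j = \frac{1}{n} \sum_{i=1}^{n} y_{ij}$, $j = 1, \ldots, k$ |
| Variance of expected responses, $\mathrm{ERV} = (\mathrm{ERV}_1, \ldots, \mathrm{ERV}_k)$ | $\mathrm{ERV}_j = \frac{1}{n - 1} \sum_{i=1}^{n} (y_{ij} - \mathrm{ERM}_j)^2$, $j = 1, \ldots, k$ |
| Regression Sum of Squares $\mathrm{RegSS} = (\mathrm{RegSS}_1, \ldots, \mathrm{RegSS}_k)$ | $\mathrm{RegSS}_j = \sum_{i=1}^{n} (z_{ij} - \mathrm{ERM}_j)^2$, $j = 1, \ldots, k$ |
| Sum of Squares of Residuals $\mathrm{ResSS} = (\mathrm{ResSS}_1, \ldots, \mathrm{ResSS}_k)$ | $\mathrm{ResSS}_j = \sum_{i=1}^{n} (y_{ij} - z_{ij})^2$, $j = 1, \ldots, k$ |
| Total Sum of Squares $\mathrm{TSS} = (\mathrm{TSS}_1, \ldots, \mathrm{TSS}_k)$ | $\mathrm{TSS}_j = \sum_{i=1}^{n} (y_{ij} - \mathrm{ERM}_j)^2$, $j = 1, \ldots, k$ |
| Determination Coefficient $R^2 = (R_1^2, \ldots, R_k^2)$ | $R_j^2 = \frac{\mathrm{RegSS}_j}{\mathrm{TSS}_j}$, $j = 1, \ldots, k$ |
| F-statistics used in testing insignificance of a group of betas $F = (F_1, \ldots, F_k)$ | $F_j = \frac{(\mathrm{ResSS}_{0j} - \mathrm{ResSS}_j) / (p - p_0)}{\mathrm{ResSS}_j / (n - p - 1)}$, $j = 1, \ldots, k$, where $\mathrm{ResSS}_j$ are computed for a model with $p + 1$ betas and $\mathrm{ResSS}_{0j}$ are computed for a reduced model with $p_0 + 1$ betas ($p - p_0$ betas are set to zero) |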

Batch Processing

Testing Insignificance of a Single Beta
Algorithm Input
The quality metric algorithm for linear regression accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
| --- | --- |
| expectedResponses | Pointer to the $n \times k$ numeric table with responses ($k$ dependent variables) used for training the linear regression model. This table can be an object of any class derived from NumericTable. |
| model | Pointer to the model computed at the training stage of the linear regression algorithm. The model can only be an object of the linear_regression::Model class. |
| predictedResponses | Pointer to the $n \times k$ numeric table with responses ($k$ dependent variables) computed at the prediction stage of the linear regression algorithm. This table can be an object of any class derived from NumericTable. |
Algorithm Parameters
The quality metric algorithm for linear regression has the following parameters:

| Parameter | Default Value | Description |
| --- | --- | --- |
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only method supported by the algorithm. |
| alpha | 0.05 | Significance level used in the computation of confidence intervals for coefficients of the linear regression model. |
| accuracyThreshold | 0.001 | Values below this threshold are considered equal to it. |
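The accuracyThreshold rule can be read as a lower clamp. The sketch below shows only that clamping rule in plain C++. The source does not say which intermediate values the library applies the threshold to, so the helper `applyAccuracyThreshold` and its vector input are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the oneDAL API: every value below
// `threshold` is raised to `threshold`; other values are left unchanged.
std::vector<double> applyAccuracyThreshold(std::vector<double> values,
                                           double threshold = 0.001) {
    assert(threshold > 0.0);  // precondition of this sketch only
    for (double& v : values) {
        v = std::max(v, threshold);
    }
    return values;
}
```

With the default threshold, the input (0.0, 0.0005, 0.001, 0.5) becomes (0.001, 0.001, 0.001, 0.5).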
Algorithm Output
The quality metric algorithm for linear regression calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
| --- | --- |
| rms | Pointer to the $1 \times k$ numeric table that contains root mean square errors computed for each response (dependent variable). |
| variance | Pointer to the $1 \times k$ numeric table that contains variances $\sigma_j^2$, $j = 1, \ldots, k$, computed for each response (dependent variable). |
| betaCovariances | Pointer to the DataCollection object that contains $k$ numeric tables, each with the $m \times m$ variance-covariance matrix for betas of the $j$-th response (dependent variable). The collection can contain objects of any class derived from NumericTable. |
| zScore | Pointer to the $k \times m$ numeric table that contains the Z-score statistics used in the testing of insignificance of individual linear regression coefficients. |
| confidenceIntervals | Pointer to the $k \times 2m$ numeric table that contains limits of the confidence intervals for linear regression coefficients: $\mathrm{confidenceIntervals}_{t,\,2j}$ is the left limit of the confidence interval computed for the $j$-th beta of the $t$-th response (dependent variable), and $\mathrm{confidenceIntervals}_{t,\,2j+1}$ is the right limit of the confidence interval computed for the $j$-th beta of the $t$-th response (dependent variable). |
| inverseOfXtX | Pointer to the $m \times m$ numeric table that contains the $(X^T X)^{-1}$ matrix. |

Here, $m$ is the number of betas in the model: $m$ is equal to $p$ when interceptFlag is set to false at the training stage of the linear regression algorithm; otherwise, $m$ is equal to $p + 1$.
By default, the rms, variance, zScore, and confidenceIntervals results are objects of the HomogenNumericTable class, but you can define each of them as an object of any class derived from NumericTable, except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
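The column layout of confidenceIntervals can be illustrated with a plain C++ sketch that fills a row-major table of $k$ rows and $2m$ columns. It assumes zero-based indices, so that for response `t` and beta `j` the left limit sits in column `2j` and the right limit in column `2j + 1`, and it assumes each limit is the beta minus or plus `pc * sqrt(v_j) * sigma_t`. The function name `confidenceIntervals` and the flat-vector inputs are hypothetical and do not mirror the oneDAL result type. The Gaussian percentile `pc` is left to the caller because the C++ standard library provides no inverse normal CDF.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; it does not call the oneDAL API.
// betas:     k x m coefficients, row-major (one row per response)
// variances: k residual variances, one per response
// v:         m diagonal elements of (X^T X)^{-1}
// pc:        Gaussian percentile supplied by the caller
// Returns a row-major k x 2m table of interval limits.
std::vector<double> confidenceIntervals(const std::vector<double>& betas,
                                        const std::vector<double>& variances,
                                        const std::vector<double>& v,
                                        double pc) {
    const std::size_t k = variances.size();
    const std::size_t m = v.size();
    assert(betas.size() == k * m);
    std::vector<double> table(k * 2 * m);
    for (std::size_t t = 0; t < k; ++t) {
        const double sigma = std::sqrt(variances[t]);
        for (std::size_t j = 0; j < m; ++j) {
            const double halfWidth = pc * std::sqrt(v[j]) * sigma;
            const double beta = betas[t * m + j];
            table[t * 2 * m + 2 * j] = beta - halfWidth;      // left limit
            table[t * 2 * m + 2 * j + 1] = beta + halfWidth;  // right limit
        }
    }
    return table;
}
```

For a single response with two betas, the returned vector holds four values in the order left limit of beta 0, right limit of beta 0, left limit of beta 1, right limit of beta 1.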
Testing Insignificance of a Group of Betas
Algorithm Input
The quality metric algorithm for linear regression accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

| Input ID | Input |
| --- | --- |
| expectedResponses | Pointer to the $n \times k$ numeric table with responses ($k$ dependent variables) used for training the linear regression model. This table can be an object of any class derived from NumericTable. |
| predictedResponses | Pointer to the $n \times k$ numeric table with responses ($k$ dependent variables) computed at the prediction stage of the linear regression algorithm. This table can be an object of any class derived from NumericTable. |
| predictedReducedModelResponses | Pointer to the $n \times k$ numeric table with responses ($k$ dependent variables) computed at the prediction stage of the linear regression algorithm using the reduced linear regression model, where $p - p_0$ out of $p$ beta coefficients are set to zero. This table can be an object of any class derived from NumericTable. |
Algorithm Parameters
The quality metric algorithm for linear regression has the following parameters:

| Parameter | Default Value | Description |
| --- | --- | --- |
| algorithmFPType | float | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Performance-oriented computation method, the only method supported by the algorithm. |
| numBeta | 0 | Number of beta coefficients used for prediction. |
| numBetaReducedModel | 0 | Number of beta coefficients ($p_0$) used for prediction with the reduced linear regression model, where $p - p_0$ out of $p$ beta coefficients are set to zero. |
Algorithm Output
The quality metric algorithm for linear regression calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

| Result ID | Result |
| --- | --- |
| expectedMeans | Pointer to the $1 \times k$ numeric table that contains the mean of expected responses computed for each dependent variable. |
| expectedVariance | Pointer to the $1 \times k$ numeric table that contains the variance of expected responses computed for each dependent variable. |
| regSS | Pointer to the $1 \times k$ numeric table that contains the regression sum of squares computed for each dependent variable. |
| resSS | Pointer to the $1 \times k$ numeric table that contains the sum of squares of residuals computed for each dependent variable. |
| tSS | Pointer to the $1 \times k$ numeric table that contains the total sum of squares computed for each dependent variable. |
| determinationCoeff | Pointer to the $1 \times k$ numeric table that contains the determination coefficient computed for each dependent variable. |
| fStatistics | Pointer to the $1 \times k$ numeric table that contains the F-statistics computed for each dependent variable. |

By default, these results are objects of the HomogenNumericTable class, but you can define each result as an object of any class derived from NumericTable, except for PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.

Examples

C++ (CPU)
Java*
There is no support for Java on GPU.
Batch Processing:
