Linear regression module, issues

 I have examined the linear regression module of DAAL 2018 (Windows version) and found two issues.

1) In the "single_beta" submodule, the RMS error and the variance are inconsistent, i.e. var(j) != rms2(j)*n/(m-p-1) 

IMO, in linear_regression_single_beta_dense_default_batch_impl.i, line #316 should be

 pRms[j] = daal::internal::Math<algorithmFPType,cpu>::sSqrt(div1*pRms[j]);

instead of

 pRms[j] = div1*daal::internal::Math<algorithmFPType,cpu>::sSqrt(pRms[j]);
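
To illustrate why the position of div1 matters, here is a minimal standalone sketch. It does not use DAAL; the function names are hypothetical, and it assumes that pRms[j] holds a sum of squared residuals at this point and that div1 is a normalisation factor such as 1/n. Under those assumptions the two variants differ by a factor of sqrt(div1).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only; they are not part of DAAL.
// sumSq : assumed sum of squared residuals (the value of pRms[j] before the line)
// div1  : assumed normalisation factor, e.g. 1/n

// Proposed form: scale first, then take the square root.
double rmsScaleInside(double sumSq, double div1) {
    assert(sumSq >= 0.0 && div1 >= 0.0);
    return std::sqrt(div1 * sumSq);
}

// Form found at line #316: take the square root, then scale.
double rmsScaleOutside(double sumSq, double div1) {
    assert(sumSq >= 0.0 && div1 >= 0.0);
    return div1 * std::sqrt(sumSq);
}
```

For example, with sumSq = 16 and div1 = 0.25, the first form gives sqrt(4) = 2 while the second gives 0.25 * 4 = 1.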

2) The "group_of_betas" submodule contains an unusual definition of the goodness parameter R2

To the best of my knowledge, R2 runs from 0 ("no fit") to 1 ("perfect fit"). In the DAAL implementation (and documentation),

R2 runs from 0 ("no fit") to 1/n ("perfect fit"). I don't know whether this is a bug or a feature.


Hi Reinhard, which file are you referring to? Thanks.

Topic 1: The file is "linear_regression_single_beta_dense_default_batch_impl.i", line #316

Topic 2: IMO, the regression sum of squares RegSS should be just the sum of squares and not the mean square. As currently computed,

            RegSS != TSS - ResSS;

            In your example, TSS-ResSS ≈ n*RegSS. I think this is the root cause of the R^2 issue.
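
A minimal sketch of this effect, independent of DAAL: it fits a one-predictor least-squares line with intercept and computes the three sums of squares. The struct and function names are hypothetical, and the division by n in r2FromMeanSquare is only my assumption about what the library does. With a plain sum, TSS = RegSS + ResSS and R^2 reaches 1 for a perfect fit; with the mean square it reaches only 1/n.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration; not a DAAL type.
struct SumsOfSquares {
    double tss;    // total sum of squares:      sum (y_i - mean(y))^2
    double resSS;  // residual sum of squares:   sum (y_i - yhat_i)^2
    double regSS;  // regression sum of squares: sum (yhat_i - mean(y))^2
};

// Ordinary least-squares fit y = intercept + slope * x, then the three sums.
SumsOfSquares fitAndSum(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());

    double xMean = 0.0, yMean = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        xMean += x[i];
        yMean += y[i];
    }
    xMean /= n;
    yMean /= n;

    double sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sxx += (x[i] - xMean) * (x[i] - xMean);
        sxy += (x[i] - xMean) * (y[i] - yMean);
    }
    assert(sxx > 0.0);  // x must not be constant

    const double slope = sxy / sxx;
    const double intercept = yMean - slope * xMean;

    SumsOfSquares s{0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double yHat = intercept + slope * x[i];
        s.tss   += (y[i] - yMean) * (y[i] - yMean);
        s.resSS += (y[i] - yHat) * (y[i] - yHat);
        s.regSS += (yHat - yMean) * (yHat - yMean);
    }
    return s;
}

// Conventional definition: RegSS is a plain sum of squares (assumes tss > 0).
double r2FromSum(const SumsOfSquares& s) {
    return s.regSS / s.tss;
}

// Suspected variant: RegSS divided by n before forming the ratio (assumes tss > 0).
double r2FromMeanSquare(const SumsOfSquares& s, std::size_t n) {
    return (s.regSS / static_cast<double>(n)) / s.tss;
}
```

For x = {1, 2, 3, 4} and y = {3, 5, 7, 9} (an exact line), TSS = 20, ResSS = 0 and RegSS = 20, so r2FromSum gives 1 while r2FromMeanSquare gives 0.25 = 1/n.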


Thanks for pointing this out. The files are from our GitHub. We will investigate this with our engineering team. Thank you!

Hi Reinhard, we are analyzing the question and will get back to you with details. Thanks!

Hello Reinhard,

Both of your observations are correct: there are bugs in the quality metrics for linear regression in Intel DAAL 2018 Gold. We will fix them in a future release of the library.
