I am using daal4py on a large data set. It is very fast, but the results look wrong.
I have already searched the web, and I know that daal4py normalizes the data before computing PCA. Still, even when I normalize the data myself and pass it to sklearn, the eigenvalues and eigenvectors differ a lot!
I have attached the code and data. I would like to see consistent results between the two libraries; otherwise, we cannot be confident about using DAAL in production. Note that preprocessing the data before passing it to sklearn is acceptable, but I tried both min-max and z-score normalization, and the results are still quite different from the Intel PCA results.
test.zip contains the test data, which is a 10000 x 512 tensor.
Running the script on my Windows machine, I got the following result:
Intel engvals = [136.30274983 85.45575273 51.9877961 ]
Sklearn engvals = [213.9291753 102.09328516 76.81426116]
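
For reference, here is a minimal sketch of the consistency I would expect, written with NumPy only so it does not depend on either library. It uses synthetic data (not the attached test.zip), and the names `corr_vals` and `cov_vals` are just illustrative. The assumption, which I have not confirmed for daal4py, is that a PCA that normalizes internally is effectively diagonalizing the correlation matrix. As far as I understand, z-scoring with the population standard deviation and then taking a covariance with an n-1 divisor (which I believe is what StandardScaler followed by sklearn's PCA amounts to) should give the same eigenvalues up to a factor of n/(n-1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data (NOT test.zip): n samples, p correlated features.
n, p = 1000, 8
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

# Route 1: eigenvalues of the correlation matrix, largest first.
corr_vals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

# Route 2: z-score with the population std (ddof=0), then take the
# covariance matrix with an n-1 divisor (np.cov default) and diagonalize it.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
cov_vals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]

# The two routes should agree once the n/(n-1) factor is removed.
print("correlation eigenvalues:", corr_vals[:3])
print("rescaled z-score eigenvalues:", (cov_vals * (n - 1) / n)[:3])

# Sanity check: correlation-matrix eigenvalues sum to the number of features.
print("sum of correlation eigenvalues:", corr_vals.sum())
```

Min-max scaling does not give unit variance per feature, so I would not expect it to reproduce correlation-matrix eigenvalues; only the z-score route should. With 512 features, the n/(n-1) factor for n = 10000 is far too small to explain the gap in the numbers above.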