SVD multithreading bug in MKL

SVD multithreading bug in MKL

Hi,

It seems like that mkl has a serious bug in ?gesvd.

First, I tried to find singular values using the Armadillo library with mkl backends, but results are different according to MKL_NUM_THREADS.

for example,

/* -*- mode: c++; -*- */
#include <armadillo>

int main(int argc, char *argv[])
{
    using namespace arma;

    mat M = randu<mat>(957, 957);
    mat U, V;
    vec s;

    svd(U, s, V, M);
    mat newM = U * diagmat(s) * V.t();

    printf("norm: %f\n", arma::norm(M - newM,2));
    return 0;
}

yields the following results

$ MKL_NUM_THREADS=1 ./main
norm: 0.000000
$ MKL_NUM_THREADS=2 ./main
norm: 0.000000
$ MKL_NUM_THREADS=3 ./main
norm: 0.000000
$ MKL_NUM_THREADS=4 ./main
norm: 0.000000
$ MKL_NUM_THREADS=5 ./main
norm: 371.303371
$ MKL_NUM_THREADS=6 ./main
norm: 138.622780
$ MKL_NUM_THREADS=7 ./main
norm: 138.622780

I wrote another program that uses mkl only, but it was worse.

/* -*- mode: c++; -*- */
#include <mkl.h>
#include <string.h>
#include <math.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    lapack_int M = 958;
    double *mat = new double[M * M];
    double *mat_orig = new double[M * M];
    double *s = new double[M];
    double *diags = new double[M*M];
    double *U = new double[M * M];
    double *Vt = new double[M * M];
    double *superb = new double[M-2];

    for (int i=0; i<M*M; ++i)
        mat[i] = static_cast<double>(rand())/RAND_MAX;

    memcpy(mat_orig, mat, sizeof(double) * M * M);
    LAPACKE_dgesvd(LAPACK_COL_MAJOR, 'a', 'a', M, M, mat, M, s, U, M, Vt, M, superb);

    memset(diags, 0, sizeof(double) * M * M);
    for (int i=0; i<M; ++i)
        diags[i * M + i] = s[i];

    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, M, M, 1.0, U, M, diags, M, 0.0, mat, M);
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, M, M, 1.0, mat, M, Vt, M, 0.0, U, M);

    double res = 0.0;
    for (int i=0; i<M*M; ++i) {
        double t = mat_orig[i] - U[i];
        res += (t*t);
    }
    printf("norm: %f\n", sqrt(res));

    delete[] superb;
    delete[] Vt;
    delete[] U;
    delete[] diags;
    delete[] s;
    delete[] mat_orig;
    delete[] mat;
    return 0;
}

$ MKL_NUM_THREADS=1 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=2 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=3 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=4 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=5 ./main2
norm: 457.024091
$ MKL_NUM_THREADS=6 ./main2
Segmentation fault (core dumped)
$ MKL_NUM_THREADS=7 ./main2
Segmentation fault (core dumped)

What's wrong with this? I'm using the lastest mkl + intel compiler (11.0 update 3) in ubuntu 12.04 LTS, intel i7 990X machine.

12 Beiträge / 0 neu
Letzter Beitrag
Nähere Informationen zur Compiler-Optimierung finden Sie in unserem Optimierungshinweis.

thanks for the report, we will check the problem on our side.

how did you link the test - statically or dinamically? is that LP64 or ILP64 interfaces?

I used -mkl option, such as

icpc main.cpp -o main -mkl -O3

yes, I see the similar behavioir on my side with the latest 11.0 update 3.

icc -mkl test.cpp. 

# export MKL_NUM_THREADS=1
# a.out
norm: 0.000000

# export MKL_NUM_THREADS=2
# ./a.out
norm: 0.000000

# export MKL_NUM_THREADS=3
# ./a.out
norm: 0.000000

# export MKL_NUM_THREADS=4
# ./a.out
norm: 0.000000
# export MKL_NUM_THREADS=5
# ./a.out
norm: 457.024091
# export MKL_NUM_THREADS=6
# ./a.out
Segmentation fault (core dumped)

We will digg the issue and will keep you updated as much as possible.

the issue is escalated to the owner of SVD. here is the number of the issue for your reference : DPD200335246

the fix of the problem available in version 11.0 update4 which released the last week.

please check the problem and let us know the results.

are there any updates regard to this issue?

Bild des Benutzers mecej4

Gennady,

I am reading this thread rather late, and I find it slightly puzzling. I cannot comment about the Armadillo library, since I don't have it. However, the OP's straight MKL example has a user error:

double *superb = new double[M-2];

should be

double *superb = new double[M-1];

since the documentation states "On exit, superb(0:min(m,n)-2) contains...". 

This error may, of course, be additional to and independent of the suspected threading bug in MKL but, even with the latest MKL update, it caused seg-faults on Windows because the array is under-allocated..

mecej, thanks a lot. this is an additional probem. You absolutely right. The test is not correct and we didn't check this example after update 4 has been released. 

I am still observing this problem 11.0 update 5. Can anybody else confirm this. Is this being worked on?

Michael, 

This is surprise for us. Do you use the example from the original post from YoungTaek O? if not then pls give us the exact example to check the problem on our side.

It would be useful to see the info about version of MKL and Processor type which have been used.

int version(void) {

  MKLVersion Version;
  mkl_get_version(&Version);
  printf("Major version: %d\n",Version.MajorVersion);
  printf("Minor version: %d\n",Version.MinorVersion);
  printf("Update version: %d\n",Version.UpdateVersion);
  printf("Product status: %s\n",Version.ProductStatus);
  printf("Build: %s\n",Version.Build);
  printf("Platform: %s\n",Version.Platform);
  printf("Processor optimization: %s\n",Version.Processor);
  return 0;
}

Melden Sie sich an, um einen Kommentar zu hinterlassen.