A bug in zgelsd in MKL 11.0.2

I get these error messages when calling zgelsd in MKL 11.0.2:

MKL INTERNAL ERROR: Condition 1 detected in function DLASD4.

MKL INTERNAL ERROR: Condition 1 detected in function DLASD8.

I prepared a small C program which generates these error messages and shows that info = 1. I ran this program on different Linux machines with the same result. When I switch to using MKL 10 these error messages disappear and the program returns a correct answer.

Zbigniew

PS. I am attaching the code (zgelsd_bug.c) and data file (mat_rhs_vals.h). I have changed the extension from .dat to .h because .dat was not accepted as an attachment.

Attachments:
mat-rhs-vals.h (text/x-chdr, 23.5 KB)
zgelsd-bug.c (text/x-csrc, 3.31 KB)

OK, thanks for the report. We will check what is wrong with this case.

Zbigniew, please try linking with -lmkl_intel_lp64 instead of -lmkl_intel_ilp64. Do you still see the problem?

Gennady, I replaced mkl_intel_ilp64 with mkl_intel_lp64 and I still get exactly these two error messages, but this time info = 257698037761 (previously info = 1).

Zbigniew, 

Yes, we reproduced the behaviour on our side. We will debug the cause of the problem and update you as soon as possible. Is this a blocking issue for you? You also mentioned that you didn't see the problem with 10.*. Which exact version of MKL did you use: 10.0, 10.1, 10.2 or 10.3? (You can find the exact version in the file ../Documentation/mklsupport, e.g. Package ID: l_mkl_11.1.0.016.)

--Gennady

No, it is not a blocking issue for me. I used the version 10.3.5 to make comparisons.

The fix for the problem is available in 11.0 update 4, released recently. Would you please check and let us know the results?

Gennady Fedorov

After installing MKL 11.0.4 and running my small program, I do not see any difference. The program still produces the error messages:

MKL INTERNAL ERROR: Condition 1 detected in function DLASD4

MKL INTERNAL ERROR: Condition 1 detected in function DLASD8

Zbigniew

Thanks for letting us know. We will recheck the issue and get back to you soon.

Dear Zbigniew,

Apart from a long-standing bug in the LAPACK sources, there is an error in your code. Namely, the following line is incorrect:

typedef long int integer;

According to the MKL User’s Guide, if you are using the LP64 MKL binaries (e.g. using the following linking line

 $ icc  -I$MKLROOT/mkl/include zgelsd_bug.c $MKLROOT/mkl/lib/intel64/libmkl_intel_lp64.a -Wl,--start-group $MKLROOT/mkl/lib/intel64/libmkl_intel_thread.a $MKLROOT/mkl/lib/intel64/libmkl_core.a -Wl,--end-group   $MKLROOT/compiler/lib/intel64/libiomp5.so  -lpthread)

the integer must be defined in the following way

typedef int integer;

If you are using the ILP64 MKL binaries (e.g. using the following linking line

 $ icc  -I$MKLROOT/mkl/include zgelsd_bug.c $MKLROOT/mkl/lib/intel64/libmkl_intel_ilp64.a -Wl,--start-group $MKLROOT/mkl/lib/intel64/libmkl_intel_thread.a $MKLROOT/mkl/lib/intel64/libmkl_core.a -Wl,--end-group   $MKLROOT/compiler/lib/intel64/libiomp5.so  -lpthread) ,

the correct definition of integer is the following:

 typedef long long integer;

In both cases, your code linked with MKL 11.0.4 returns info=0:

$./a.out

Reading the input matrix...

Reading the input RHS...

Done

info=0 

Thanks

All the best

Sergey

It was not a bug in my program, because I compile the program under Linux using gcc (with several different 4.x versions, including the recent 4.7.2), and on this platform long and long long have the same size. Nevertheless, I now use "long long" and still get the same error messages. Here is the way I compile the program:

gcc -Wall -g -O0 -o zgelsd_bug zgelsd_bug.c -fno-strict-aliasing -L $LD_LIBRARY_PATH -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm

where LD_LIBRARY_PATH points to MKL 11.0.4.

Zbigniew

I forgot to mention that I use only dynamic MKL files (.so) not static ones (.a).

Zbigniew

Dear Zbigniew,

I checked MKL 11.0.4 with the dynamic MKL binaries and many gcc compilers, and everything works as expected:

$ gcc -Wall -g -O0 -o zgelsd_bug zgelsd_bug.c -fno-strict-aliasing -L$MKLROOT/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm

$ ./zgelsd_bug                                
Reading the input matrix...
Reading the input RHS...
Done
info = 0

Would you please try static linking with gcc and let us know the results?

To make sure that your executable is really using MKL 11.0.4, I'd recommend calling the function MKL_Get_Version_String(buf, buf_len). Please insert the following lines in your code:

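    /* Requires #include <stdio.h> and #include <mkl.h> at the top of the file. */
    #define buf_len 198

    char buf[buf_len];
    printf("\nMKL release version:\n");
    MKL_Get_Version_String(buf, buf_len);  /* fills buf with the MKL version text */
    printf("%s\n", buf);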

If the environment variables are correct, you should get something like this:

$ gcc -Wall -g -O0 -o zgelsd_bug zgelsd_bug.c -fno-strict-aliasing -L$MKLROOT/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm

$ ./zgelsd_bug                          
 
MKL release version:
Intel(R) Math Kernel Library Version 11.0.4 Product Build 20130422 for Intel(R) 64 architecture applications
Reading the input matrix...
Reading the input RHS...
Done
info = 0

All the best

Sergey

I inserted the function MKL_Get_Version_String into my code. Then I compiled it using the static MKL libraries. Here are the command lines I used:

setenv LD_LIBRARY_PATH /opt/intel/composer_xe_2013.4.183/compiler/lib/intel64
setenv MKLROOT /opt/intel/composer_xe_2013.4.183

gcc -I$MKLROOT/mkl/include -o zgelsd_bug zgelsd_bug.c -Wl,--start-group $MKLROOT/mkl/lib/intel64/libmkl_intel_ilp64.a $MKLROOT/mkl/lib/intel64/libmkl_intel_thread.a $MKLROOT/mkl/lib/intel64/libmkl_core.a -Wl,--end-group $MKLROOT/compiler/lib/intel64/libiomp5.so -lm -lpthread -ldl

Then I ran ./zgelsd_bug and got this output:

MKL release version:
Intel(R) Math Kernel Library Version 11.0.4 Product Build 20130517 for Intel(R) 64 architecture applications
Reading the input matrix...
Reading the input RHS...
Done

MKL INTERNAL ERROR: Condition 1 detected in function DLASD4.

MKL INTERNAL ERROR: Condition 1 detected in function DLASD8.
info = 1

So the bug is still present unless I do something wrong. I used  gcc version 4.7.2 under Linux (Fedora 18).

Zbigniew

I inserted MKL_Get_Version_String as recommended. Then I compiled my program using the static MKL libraries. Here are my command lines:

setenv LD_LIBRARY_PATH /opt/intel/composer_xe_2013.4.183/compiler/lib/intel64

setenv MKLROOT /opt/intel/composer_xe_2013.4.183

gcc -I$MKLROOT/mkl/include -o zgelsd_bug zgelsd_bug.c -Wl,--start-group $MKLROOT/mkl/lib/intel64/libmkl_intel_ilp64.a $MKLROOT/mkl/lib/intel64/libmkl_intel_thread.a $MKLROOT/mkl/lib/intel64/libmkl_core.a -Wl,--end-group $MKLROOT/compiler/lib/intel64/libiomp5.so -lm -lpthread -ldl

Then I ran ./zgelsd_bug and got this output:

MKL release version:
Intel(R) Math Kernel Library Version 11.0.4 Product Build 20130517 for Intel(R) 64 architecture applications
Reading the input matrix...
Reading the input RHS...
Done

MKL INTERNAL ERROR: Condition 1 detected in function DLASD4.

MKL INTERNAL ERROR: Condition 1 detected in function DLASD8.
info = 1

Possibly I do something wrong, but the bug is still there for me. I used gcc version 4.7.2 on Linux (Fedora 18).

Zbigniew

PS. Sorry if this comment is submitted twice. It seems that my previous comment was deleted.

Dear Zbigniew,

It is quite strange. I tested with the GNU compiler

$ gcc --version
gcc (GCC) 4.7.2
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

on RedHat_6.0_x86_64 and have not seen any error messages. I suspect that the MKL 11.0.4 you have been using was installed or packaged incorrectly. The routine you are using is inside libmkl_core.so or libmkl_core.a, and I’m inclined to think that these binaries are old. Would you please check the size and creation date of libmkl_core.so or libmkl_core.a in $MKLROOT/mkl/lib/intel64 and let us know?

A full reinstallation of MKL 11.0.4, after completely deleting the current MKL 11.0.4 files, might help.

Another possible reason is that the GNU compiler was installed or configured incorrectly. Running an MKL example, for example the LAPACKE examples located in the directory   with the command

make sointel64 interface=lp64 compiler=gnu

might help to understand what is wrong.

All the best

Sergey

Hello Zbigniew, I am very sorry for the misinformation. This was my fault: the fix for the problem was not actually included in update 4.
The next update will contain this fix.

regards, Gennady

This bug is fixed in MKL 11 update 5.

Thanks,

Zbigniew
