I've run into a problem with the LAPACK error handling routine
XERBLA on a 32-bit Windows machine, using ifort 12.1 and MKL 10.3.
The test program below calls the LAPACK routine ZGETRF from MKL, and
deliberately sets the first argument M to an incorrect value, -1.
ZGETRF detects this and is supposed to call LAPACK routine
XERBLA to output an error message and then stop.
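Program test
Complex (Kind=Kind(1.0d0)) :: a(2,2)
Integer m, n, lda, ipiv(2), info
a = (1.0d0,0.0d0)
! M is deliberately invalid, so ZGETRF should call XERBLA
m = -1
n = 2
lda = 2
Call zgetrf(m, n, a, lda, ipiv, info)
End Program test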
That's the theory, and in practice it does work, so long as
the program is compiled to use the default _cdecl calling convention.
Here's how I compile:
ifort test.f90 mkl_intel_c.lib mkl_sequential.lib mkl_core.lib
and this is what correctly happens at run time:
MKL ERROR: Parameter 1 was incorrect on entry to ZGETRF
However, I want to compile using the CVF calling convention, like this:
ifort /iface:cvf test.f90 mkl_intel_s.lib mkl_sequential.lib mkl_core.lib
Notice that I now need to link to mkl_intel_s.lib rather than
mkl_intel_c.lib, as suggested by the MKL link line advisor tool.
With this version of the executable, I get an access violation at run time:
forrtl: severe (157): Program Exception - access violation
Image      PC        Routine   Line      Source
test.exe   00971546  Unknown   Unknown   Unknown
Tracing with a debugger I see that the access violation happens
several levels deep inside XERBLA.
This experiment was done with this compiler:
Intel Visual Fortran Compiler XE for applications running on IA-32, Version 12.1 Build 20110811
Copyright (C) 1985-2011 Intel Corporation. All rights reserved.
and using MKL 10.3 that comes in the same installation package as the compiler.
I then tried compiling the same program using an older compiler version and
older MKL, and it worked fine. I used MKL 10.0.3.021 with compiler:
Intel Visual Fortran Compiler for applications running on IA-32, Version 10.1 Build 20080602 Package ID: w_fc_p_10.1.024
Copyright (C) 1985-2008 Intel Corporation. All rights reserved.
and using the same compile line as before showed no problems.
With the newer compiler and MKL, if I add my own copy of XERBLA to the
end of the test program, then the problem also goes away. Of course
I can't use that workaround if I link to the DLL version of MKL because
it's not possible to override a DLL routine in that way. In any case, we
can't tell NAG users to override XERBLA like that.
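For reference, the kind of override I mean can be sketched as below. It
assumes the standard LAPACK interface for XERBLA (the routine name as a
character argument, then the integer position of the bad argument). The
message text and format are only illustrative and are not copied from MKL.

```fortran
Subroutine xerbla(srname, info)
  ! User-supplied replacement for the library XERBLA.
  ! Interface assumed to follow the LAPACK reference version:
  !   srname - name of the routine that detected the error
  !   info   - position of the invalid argument in its argument list
  Implicit None
  Character (*), Intent (In) :: srname
  Integer, Intent (In) :: info

  ! Illustrative message only; the wording is not MKL's own.
  Write (*, '(1X,A,I0,A,A)') 'Parameter ', info, &
    ' was incorrect on entry to ', srname
  Stop
End Subroutine xerbla
```

Appended to test.f90 and built with the same /iface:cvf compile line, this
copy is picked up by the linker in place of the one in the static MKL
libraries.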
So, my suspicion is that the CVF interface to XERBLA has been broken
somehow in newer versions of MKL. I'm guessing that the access violation
may be caused by a mix-up of _cdecl and CVF calling conventions. XERBLA
attempts to use the value 6 as an address, and I think 6 is the hidden
string length argument for 'ZGETRF'.
Numerical Algorithms Group, Oxford