gfortran+MKL+zdotc => CRASH

I have found a bug that causes a crash when a Fortran program compiled with gfortran 4.7.2 is linked against MKL in 32-bit mode.

The symptom is that the BLAS function ZDOTC returns an incorrect result:

        ZDOTC returned  (  0.0000000000000000     ,  1.5907197451559738E-314)   Expected  (-0.75000000000000000     ,  1.0000000000000000     )

This is in fact the output of a test I inserted in my code some time ago to guard against linking with LAPACK and BLAS libraries that use an incompatible calling convention (e.g. g77 vs ifort). It is triggered when I link a gfortran-compiled program against MKL in 32-bit mode.

My link line contains

 -L"/opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/ia32"  -Wl,--start-group -lmkl_gf -lmkl_core -lmkl_sequential -Wl,--end-group

as recommended by the link adviser, and the link succeeds without any warnings.

The fault occurs with both MKL 10.3.11 and 11.0.1, with gfortran 4.7.2, on x86_64 Linux.

It does NOT happen in "intel64" mode with the same versions of the compilers and libraries. 

The error is NOT triggered when linking the same object files against either OpenBLAS or ACML.

It occurs irrespective of whether the link is with static or shared libraries.

The zdotc test itself has not yet given a false positive when used with a wide suite of compilers and libraries, so I don't think it is wrong now (and in any case, commenting out the check results in a segfault). FYI, it declares ZDOTC with the interface

    interface
       function zdotc(N, X, INCX, Y, INCY)
         use constants, only : dp
         integer :: n, incx, incy
         complex(kind=dp), dimension(*) :: X, Y
         complex(kind=dp) :: zdotc
       end function zdotc
    end interface

(where dp = 8) and simply calls it with a short vector, testing the result against the expected value.
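
For readers without access to the attachment, a check of this kind might look roughly like the sketch below. It is not the CASTEP test itself: the program name, the input vectors and the tolerance are hypothetical, and dp is defined locally rather than taken from the constants module. The reference value comes from the intrinsic dot_product, which conjugates its first argument for complex data and so matches the definition of ZDOTC.

```fortran
program zdotc_link_check
  implicit none
  ! Assumption: dp is double precision. The original code takes it from a
  ! module named "constants", which is not reproduced here.
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(kind=dp) :: x(n), y(n), got, expected

  ! Same interface as quoted in the post, with dp imported from the host.
  interface
     function zdotc(n, x, incx, y, incy)
       import :: dp
       integer :: n, incx, incy
       complex(kind=dp), dimension(*) :: x, y
       complex(kind=dp) :: zdotc
     end function zdotc
  end interface

  ! Hypothetical inputs, chosen only for illustration.
  x = [ (1.0_dp, 2.0_dp), (0.5_dp, -1.0_dp) ]
  y = [ (0.25_dp, 1.0_dp), (-1.0_dp, 0.5_dp) ]

  ! Reference value computed without BLAS: sum(conjg(x) * y).
  expected = dot_product(x, y)

  ! Value computed by the BLAS library selected at link time.
  got = zdotc(n, x, 1, y, 1)

  ! Hypothetical tolerance; a calling-convention mismatch usually gives
  ! garbage far outside any sensible tolerance, or a crash in the call.
  if (abs(got - expected) > 1.0e-12_dp) then
     print *, 'ZDOTC returned ', got, '  Expected ', expected
     stop 1
  end if
  print *, 'BLAS link test passed'
end program zdotc_link_check
```

The program has to be linked against the BLAS library under test, for example with the same -m32 and -lmkl_gf -lmkl_core -lmkl_sequential options used elsewhere in this thread. If the complex return convention does not match, the call may also segfault before the comparison is reached.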

I suspect a bug in libmkl_gf, or an incompatibility with this version of GNU Fortran. Can anyone enlighten me, or confirm or deny my suspicion?

Keith Refson

Can you please upload your test code and the input data to zdotc, so we can try to reproduce this problem?

Thanks.

Here is the test. 

ifort -O1 -o blas_link_test.if32 blas_link_test.f90 -mkl=sequential

$ ./blas_link_test.if32
  BLAS Link test passed

$ gfortran -O3 -o blas_link_test.gf32-acml -m32 blas_link_test.f90  -L/opt/acml4.4.0/gfortran32/lib -lacml
[kr@kohn MKL_GF_BLAS]$ ./blas_link_test.gf32-acml
  BLAS Link test passed
$ gfortran -O1 -o blas_link_test.gf32-ob -m32 blas_link_test.f90 -L/usr/local/lib32gfortran -lopenblas
[kr@kohn MKL_GF_BLAS]$ ./blas_link_test.gf32-ob
  BLAS Link test passed
$ gfortran -O3 -o blas_link_test.gf32.mkl -m32 blas_link_test.f90 -lmkl_gf -lmkl_core -lmkl_sequential
[kr@kohn MKL_GF_BLAS]$ ./blas_link_test.gf32.mkl

  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     
  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     

    An internal test has determined that this CASTEP executable is
    faulty and can not be used. The run has been ABORTED.

        ZDOTC returned  (  2.0439333122041759E-314,  1.5907197451559738E-314)   Expected  (-0.75000000000000000     ,  1.0000000000000000     )

    A test of correct functioning of ZDOTC has FAILED.  Most likely
    this is because the CASTEP executable was linked against a BLAS
    library with an incompatible calling convention.
    Please see file README.INSTALL in source code for more information.

  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     
  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     

  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     
  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     

    An internal test has determined that this CASTEP executable is
    faulty and can not be used. The run has been ABORTED.

        ZDOTC returned  (  2.0439333122041759E-314,  1.5907197451559738E-314)   Expected  (-0.75000000000000000     ,  1.0000000000000000     )

    A test of correct functioning of ZDOTC has FAILED.  
    Most likely this is because the CASTEP executable was linked
    against a BLAS library with an incompatible calling convention.
    Please see file README.INSTALL in source code for more information.

  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     
  C A S T E P    C O M P I L A T I O N    E R R O R    D E T E C T E D     

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0xF69A5C9B
#1  0xF69A62EC
#2  0xF77A13FF
#3  0x0
Segmentation fault

Attachment: blas-link-test.f90 (5.24 KB)

gfortran -O3 -o blas_link_test.gf32.mkl -m32 -L/opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/ia32/ blas-link-test.f90 -lmkl_gf -lmkl_core -lmkl_sequential

export LD_LIBRARY_PATH=/usr/local/gcc-4.9/lib:/opt/intel/composer_xe_2013_sp1.1.106/mkl/lib/ia32/:$LD_LIBRARY_PATH

$ ./blas_link_test.gf32.mkl
  BLAS Link test passed

Keith,

I was NOT able to reproduce your problem using either MKL 11.0.1 or the latest MKL 11.1. I used the same command line and compile/link options you used. The test passed. Here's my system configuration info:

CPU: Intel(R) Xeon(R) CPU E5540
OS: RHEL 6.0 (Linux kernel 2.6.32-71.el6.x86_64)
GLIBC version: 2.12
gfortran version: 4.4.4
Intel MKL: 11.0.1 or 11.1

Note that you used gfortran 4.7.2, but I used gfortran 4.4.4. I do not have a system with gfortran 4.7.2 readily available. But I doubt this would make a difference.

Keith,

Any chance that this is a bug in gfortran? Your test passes with -O0.

Thanks
Dima

I have just tested the most obvious possibility for the discrepancy between the behaviour you and I see. It does appear that the version of gfortran is the relevant variable.

gfortran versions 4.3.6, 4.4.7, 4.5.3, 4.6.1 and 4.8.2, and the 4.9 development versions, all pass the test. gfortran 4.7.x always fails.

(The prebuilt executable snapshots at gfortran.com are very useful for testing!)

I wonder what the origin of the bug could be, and why only the specific combination of this version of gfortran and MKL fails. It can't be as simple as gfortran generating a bad calling sequence, since the same code succeeds when linked against ACML and OpenBLAS. I will try the gfortran mailing list to see if anyone there can shed light on this.

K.
