MKL DGEMM thread safety


When I run my threaded application (several threads calling Fortran subroutines that use the MKL BLAS function DGEMM), I'm getting the error "DGEMM parameter number x had an illegal value", where x can be 8, 10, ... and even 0! I'm sure that I'm not sharing memory among the DGEMM calls. Could this be heap corruption? How can I figure out what is going on? (It is only reproduced about once in a thousand executions.)

Thanks in advance


Sorry,

Tested on 64-bit Linux, with the MKL from Intel Fortran 12.1 and also with the latest MKL. I'm linking with mkl_intel_thread and calling mkl_domain_set_num_threads(0, MKL_ALL).

Why can't I edit my own posts 10 minutes later?

Did you dynamically allocate memory for the data passed to DGEMM?

You can try Intel Inspector to uncover many memory-related bugs. If you haven't purchased a license, you can download a fully functional 30-day trial version: https://software.intel.com/en-us/intel-inspector-xe

Yes,

I've used Inspector (on Windows) to detect data races. It reports some MKL data races, for example when two threads run this Fortran code:

allocate(M_INV(A,A))

! FILL M_INV
!  .....

CALL DPOTRI( 'U', N, M_INV, A, INFO )  ! <- data race reported here

Could this be an installation problem?

You would need to ensure that each instance of the MKL call has its own threadprivate copy of any procedure arguments that differ among threads or may be modified within MKL. Otherwise, you would need to put a critical or single around the suspect MKL call.
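The advice above could be sketched roughly as follows. This is only an illustration, assuming the threads are OpenMP threads (the thread does not say which threading model the application uses) and that a BLAS providing DGEMM is linked in. The problem size `n`, the `serialize` switch, and the critical-section name `mkl_dgemm` are hypothetical names chosen for the example.

```fortran
! Sketch only: per-thread buffers for DGEMM, plus an optional critical section.
! Assumes OpenMP and a linked BLAS that provides DGEMM (MKL or any other).
program dgemm_private_buffers
  use omp_lib
  implicit none
  integer, parameter :: n = 64                 ! hypothetical problem size
  logical, parameter :: serialize = .false.    ! .true. applies the critical workaround
  double precision, allocatable :: a(:,:), b(:,:), c(:,:)
  external :: dgemm

  !$omp parallel private(a, b, c)
  ! Each thread allocates its own arrays, so nothing is shared between calls.
  allocate(a(n,n), b(n,n), c(n,n))
  a = 1.0d0
  b = 2.0d0
  c = 0.0d0

  if (serialize) then
     ! Only one thread at a time may execute the call.
     !$omp critical (mkl_dgemm)
     call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
     !$omp end critical (mkl_dgemm)
  else
     call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)
  end if

  ! With a = 1 and b = 2, every entry of c should equal 2*n.
  if (any(abs(c - 2.0d0*n) > 1.0d-9)) then
     print *, 'thread', omp_get_thread_num(), ': unexpected DGEMM result'
  end if

  deallocate(a, b, c)
  !$omp end parallel
end program dgemm_private_buffers
```

Setting `serialize` to `.true.` trades parallelism for safety, so it is more useful as a diagnostic than as a permanent fix.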

Yes, I know. In the case of DPOTRI, the arguments are characters, integers, numbers, and one double array that I'm allocating just before the call, so I think there is no shared memory here.

If I then deallocate M_INV and another thread does allocate(M_INV), could the runtime be returning the same memory address for the new allocation, so that Inspector detects a data race?

Why is the runtime telling me that parameter 0 is an illegal argument?
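For reference, the reference BLAS numbers the DGEMM arguments from 1 to 13, so 8 and 10 correspond to LDA and LDB, and 0 does not correspond to any argument. One way to narrow this down might be to validate the arguments in the calling thread just before each DGEMM call. The sketch below mirrors the checks done by the reference BLAS DGEMM; it assumes MKL uses the same numbering, which this thread does not confirm. The module and function names are hypothetical.

```fortran
! Sketch only: pre-call validation mirroring the reference BLAS DGEMM checks.
module dgemm_check
  implicit none
contains
  ! Returns 0 if the arguments pass, otherwise the 1-based position of the
  ! first offending DGEMM argument (1, 2, 3, 4, 5, 8, 10 or 13).
  integer function dgemm_arg_check(transa, transb, m, n, k, lda, ldb, ldc) result(info)
    character, intent(in) :: transa, transb
    integer, intent(in) :: m, n, k, lda, ldb, ldc
    integer :: nrowa, nrowb
    logical :: nota, notb

    nota = (transa == 'N' .or. transa == 'n')
    notb = (transb == 'N' .or. transb == 'n')
    nrowa = merge(m, k, nota)   ! rows of A as stored
    nrowb = merge(k, n, notb)   ! rows of B as stored

    info = 0
    if (.not. nota .and. index('TtCc', transa) == 0) then
       info = 1
    else if (.not. notb .and. index('TtCc', transb) == 0) then
       info = 2
    else if (m < 0) then
       info = 3
    else if (n < 0) then
       info = 4
    else if (k < 0) then
       info = 5
    else if (lda < max(1, nrowa)) then
       info = 8
    else if (ldb < max(1, nrowb)) then
       info = 10
    else if (ldc < max(1, m)) then
       info = 13
    end if
  end function dgemm_arg_check
end module dgemm_check
```

For example, `dgemm_arg_check('N', 'N', 4, 4, 4, 2, 4, 4)` returns 8, because LDA = 2 is smaller than M = 4. If this check passes in the calling thread but MKL still reports an illegal parameter, the values were probably changed after the check, which would point toward a race or memory corruption rather than a wrong call.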

I've tried putting all DGEMM calls (there are many more LAPACK calls like DPOTRI) inside critical sections, and now there is no error.

I'm using the mkl_intel_thread version. Why is this only happening with DGEMM? Inspector doesn't tell me anything about the DGEMM calls.
