SVD hangs periodically

We are noticing that SVD, both dgesvd and dgesdd, will hang periodically. The call stack terminates with a call to either one of those functions. Killing the process and rerunning alleviates the problem.

We suspect there's some form of locking going on in SVD somewhere. Something we've tried is the following:

  • start a new thread to compute the SVD
  • wait until it finishes or a timeout expires
  • if it times out, try again on a second thread
  • if the second attempt also times out, throw
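
For readers who want to see the shape of this workaround, here is a minimal sketch of the steps above in Python. It is not our production code: `run_with_timeout` and `SvdTimeoutError` are hypothetical names, and `func` stands in for whatever actually calls dgesvd or dgesdd.

```python
import threading


class SvdTimeoutError(RuntimeError):
    """Raised when every attempt times out (hypothetical name)."""


def _worker(func, out):
    # Store the outcome in a dict owned by this attempt only, so an
    # abandoned thread can never overwrite a later attempt's result.
    try:
        out["value"] = func()
    except BaseException as exc:
        out["error"] = exc


def run_with_timeout(func, timeout_s, attempts=2):
    """Run func() on a worker thread; retry on timeout, then raise."""
    for _ in range(attempts):
        out = {}
        # daemon=True so a hung worker does not block interpreter exit.
        thread = threading.Thread(target=_worker, args=(func, out), daemon=True)
        thread.start()
        thread.join(timeout_s)
        if thread.is_alive():
            continue  # timed out: abandon this thread and try again
        if "error" in out:
            raise out["error"]
        return out["value"]
    raise SvdTimeoutError(f"no result after {attempts} attempts of {timeout_s} s")
```

Note that a timed-out thread is only abandoned, not killed, so whatever it is blocked on stays blocked for the life of the process.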

What was interesting about this is that the first thread will block (just like before we put the threading in), but the second thread wouldn't even start. Now, there are a limited number of resources and actions that could stop a new thread. One of them is that every new thread must call into the DllMain of every DLL loaded in the process. If one of those DLLs is doing something non-standard there (holding a lock), then you are dead.

We suspect there's a problem somewhere in the Intel library, and possibly in another library in our stack. Our own code generally does not mess with DllMain or thread registration.

Hi Chris,

What MKL version, OS type, CPU type, threading model, etc. are you using? You may export MKL_VERBOSE=1 and run your executable file.

If possible, could you isolate the MKL issue and provide us with a small reproducer?

Best Regards,
Ying

Hey Ying,

MKL Version: 11.3.4

OS Type: Microsoft Windows Server 2012 Standard, 6.2.9200

CPU Type: Intel64 Family 6 Model 45 Stepping 7 GenuineIntel ~2700 Mhz, Intel64 Family 6 Model 23 Stepping 10 GenuineIntel ~3000 Mhz, Intel64 Family 6 Model 45 Stepping 7 GenuineIntel ~1188 Mhz. We have five servers; these are the processors across them all. This same process has also hung on other servers.

ThreadModel: We've tried the default setting and MKL_THREADING_SEQUENTIAL, both set as an environment variable and through the mkl_set_threading_layer function.

It's not possible to provide a test case as this problem is intermittent on our end as well. It will hang one run, we'll kill the executable and rerun the same process and it will run without fail. I asked our users for the VERBOSE files, but they haven't got back yet. I'll update this forum when they become available. 
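
In the meantime, the kill-and-rerun step can be automated from outside the process. The following is only a rough sketch under assumptions: `run_with_restart` is a hypothetical helper, and the command, timeout, and retry count are placeholders to be tuned. Passing `extra_env={"MKL_VERBOSE": "1"}` would also turn on the verbose output Ying asked for.

```python
import os
import subprocess


def run_with_restart(cmd, timeout_s, attempts=2, extra_env=None):
    """Run cmd as a child process; kill and rerun it if it exceeds timeout_s."""
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. {"MKL_VERBOSE": "1"}
    for attempt in range(1, attempts + 1):
        try:
            # check=True raises CalledProcessError on a non-zero exit code.
            return subprocess.run(cmd, env=env, timeout=timeout_s, check=True)
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the child before re-raising.
            print(f"attempt {attempt} timed out after {timeout_s} s")
    raise TimeoutError(f"{cmd!r} hung on all {attempts} attempts")
```

Because the hung process is killed as a whole, any lock it held goes away with it, which is not true of an abandoned thread inside the same process.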

Thanks,

CS

Chris, could you try evaluating the next version, MKL 2017? Checking the Bug Fix List, I see several SVD-related issues were fixed during the MKL 2017 beta time frame.

Hi Chris, 

If the issue is still there, please submit it to the Online Service Center, http://www.intel.com/supporttickets, which is our official support channel and where the related information is protected.

Best Regards,

Ying 
