Segfault when using threaded 1D DFT on AVX platforms

nicpac22

Hi,

I've recently ported some code to a 64-bit CentOS 6 server that supports AVX instructions, and I think I have encountered a bug in the MKL DFT routines when threading is enabled. When I try to take an 80640-point complex 1D forward DFT, I get a segfault if I call mkl_set_num_threads with any value greater than 1, yet the code works fine with mkl_set_num_threads(1). I'm not sure whether this has been documented or encountered by others, but for me it seems to be limited to my 64-bit AVX platform: when I compile on a 64-bit SSE4.2 platform, the code runs fine with no segfault. I've attached the test code that I've been running to debug. For reference, I am compiling with:

icpc -O3 -xHost test.cpp -openmp -liomp5 -lpthread -lm -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread

Here are my system stats:

Compiler: Intel Composer XE 2013.0.79 (MKL v11.0)

OS: 64-bit Linux CentOS 6.4

CPU: Xeon E5-2690@2.9GHz

Also, when I run the core dump through gdb, I get the following backtrace:

mkl_dft_avx_xc_4step_1_2 ()

step1234 ()

ttl_parallel_team ()

L_kmp_invoke_pass_parms ()

Is this a bug or am I just doing something wrong with my DFT?

Thanks,

Nick

Attachment: test.cpp (1.15 KB)
Chao Y (Intel)

Hi Nick,

Could you check with the latest MKL 11.0.3 release and see whether the problem is still there? We found a bug in the original MKL 11.0 release in the AVX threading code for some problem sizes, and it has been fixed since MKL 11.0 update 2. Please let us know if you still have any problem with the new release.

Regards,
Chao

Sergey Kostrov

>>...When I try to take a 80640 point complex 1D forward DFT, I get a segfault if I set mkl_set_num_threads to any number
>>greater than 1...

Could you verify what value is set for the OMP_STACKSIZE environment variable?
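
If checking from the shell is awkward, the value can also be read from inside the test program. Below is a minimal sketch using only the C++ standard library; the helper name describe_env is made up for illustration and is not part of MKL or OpenMP.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not an MKL or OpenMP function): returns "NAME=value"
// if the environment variable is set, or "NAME is not set" otherwise.
std::string describe_env(const char* name) {
    const char* value = std::getenv(name);  // null pointer when the variable is unset
    if (!value) {
        return std::string(name) + " is not set";
    }
    return std::string(name) + "=" + value;
}
```

Printing describe_env("OMP_STACKSIZE") at the top of main, before the first MKL call, shows what the process actually inherited. If the variable is not set, the OpenMP runtime uses its own implementation-defined default stack size for the worker threads.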

iliyapolak

Can you post the full backtrace and register context?

Tim Prince

[tim@tim-cp net]$ ./a.out
40381.5 40260.9

That is with either of the two most recent compiler/MKL releases, on:

cpu family      : 6
model           : 62
model name      : Genuine Intel(R) CPU  @ 2.50GHz
stepping        : 2

(HT disabled)

nicpac22

Chao - Thank you for the suggestion, and sorry for my delay in responding. I am waiting for my sys admin to update our MKL install to 11.0 update 2 or 3 (we need explicit approval before updating software). As soon as he does that, I'll verify whether the bug still occurs.

Sergey - I will check OMP_STACKSIZE tomorrow as well.

iliyapolak - Unfortunately, posting the full backtrace is difficult since I don't have network access to the server I'm running on (which is at my office) and I don't have an easy way to move an electronic copy of the backtrace out of the office. If updating MKL does not resolve the issue, I'll try to post a full backtrace.

Nick

Chao Y (Intel)

Nick,

I hope you got a chance to check the new release. Feel free to post an update here on any progress.

Regards,
Chao

nicpac22

Chao,

Very sorry for the delayed response; my sys admin just installed MKL 11 update 3 on Friday (we have a lengthy software approval process). I can confirm that upgrading to update 3 has fixed the problem: I am no longer able to reproduce the segfault, regardless of threading or DFT size. Thanks for your help.

Nick

iliyapolak

@Nick

Sorry for the late response. I am glad that the problem was solved :)
