Segmentation fault in MKL_get_N_Cores

As part of a 2D spline interpolation routine, I'm calling dgesv(). That routine is giving me a segmentation fault in MKL_get_N_Cores(). The debugger output is:

Dump of assembler code for function MKL_get_N_Cores:
0x0063f190 <+0>: push %ebx
0x0063f191 <+1>: push %esi
0x0063f192 <+2>: push %edi
0x0063f193 <+3>: push %ebp
0x0063f194 <+4>: sub $0x4ecc,%esp
0x0063f19a <+10>: call 0x63f19f
0x0063f19f <+15>: pop %edi
0x0063f1a0 <+16>: lea 0x4671c9(%edi),%edi
0x0063f1a6 <+22>: cmpl $0x1,0xa1f74(%edi)
0x0063f1ad <+29>: je 0x63f1c1
0x0063f1af <+31>: mov %edi,%ebx
0x0063f1b1 <+33>: call 0x63b820
0x0063f1b6 <+38>: mov %eax,%esi
0x0063f1b8 <+40>: cmpl $0xffffffff,0x3658(%edi)
0x0063f1bf <+47>: je 0x63f1cc
0x0063f1c1 <+49>: add $0x4ecc,%esp
0x0063f1c7 <+55>: pop %ebp
0x0063f1c8 <+56>: pop %edi
0x0063f1c9 <+57>: pop %esi
0x0063f1ca <+58>: pop %ebx
0x0063f1cb <+59>: ret

The crash occurs at the line: call . This is running on Ubuntu with the g++ compiler. I was originally linking against the static libraries; I switched to the dynamic libraries and got the same fault. The 2D spline code is also included in another app. I fed the same input file to that second app and it works fine. I verified with the debugger that the arguments to dgesv were identical in the two apps. The app that crashes uses about 1.2GB of RAM, while the app that doesn't crash uses about 100MB.

Any idea what's causing this, or suggestions for a workaround?

Bruce

Dear Bruce,

Could you please give me a test case so that I can reproduce the crash and dig into what the problem could be?

Thanks,
Sridevi

Sridevi Allam
Technical consulting engineer - Intel MKL

I'll try. I thought of another difference between the two apps: the app that fails is using real-time extensions (Xenomai), and the app that works is not. I'll write a small test app. If it works, I'll add a few Xenomai calls and see if it fails.

This may take a few days.

Bruce

Bruce,

Thanks for taking the time to create a small test case to reproduce the problem.

This is just a guess, but MKL_get_N_Cores tries to detect the CPU topology. If the Xenomai framework changes it by shadowing some CPU parameters (for example, CPU affinity), then MKL might get confused. Even so, it should not crash.

Thanks,
-- Victor

Attached are 2 test cases. The one built as a linux program (mkltest)
works. The xenomai version (mklxentest) fails with the same fault.

I took the dgesv example and turned it into a function. In the xenomai
version, main() makes a couple of xenomai calls to create and run it as a
task.

Bruce

Bruce,

Nothing was attached in your previous post. Please try again.

Also, it would be helpful to add some description of how to run your tests, e.g. how to create the Xenomai environment and run the second test.

Thanks,
-- Victor

Found the answer. When a xenomai task is created, you tell it the
amount of stack space to allocate. From the documentation "The size of
the stack (in bytes) for the new task. If zero is passed, a reasonable
pre-defined size will be substituted." We were passing 0. When I
increased the stack size to 1MB, then the MKL calls didn't crash.
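
For anyone who hits the same thing: the prologue in the disassembly above (sub $0x4ecc,%esp) reserves 0x4ecc = 20172 bytes, roughly 20KB, of stack for this one function, which is consistent with a small default task stack being too small. Below is a minimal sketch of the same idea using plain POSIX threads rather than the Xenomai API; big_frame_worker and run_with_stack are hypothetical names made up for illustration, and the worker only stands in for a routine with a large stack frame. In Xenomai, the stack-size argument passed at task creation would play the role that pthread_attr_setstacksize plays here.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a library routine with a large stack frame.
   The buffer size mirrors the 0x4ecc bytes reserved in the disassembly. */
static void *big_frame_worker(void *arg)
{
    volatile char scratch[0x4ecc];
    size_t i;

    /* Touch every byte so the whole frame really is used. */
    for (i = 0; i < sizeof scratch; i++)
        scratch[i] = (char)i;

    *(int *)arg = 1;            /* report that the worker ran to completion */
    return NULL;
}

/* Run the worker on a thread whose stack size is requested explicitly
   instead of relying on a default. Returns 0 on success, otherwise the
   error code from the failing pthread call. */
int run_with_stack(size_t stack_bytes, int *done)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, big_frame_worker, done);
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling run_with_stack(1 << 20, &done) requests the same 1MB that fixed the crash here. How much stack the MKL routines need overall is not something this sketch measures; 1MB is simply the value that worked in this case.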

As far as I can tell the files are there. Not sure what I need to do so you can access the files.