Dynamic Loading Issues With MKL From SWIG Module in Python

I have a real head-scratcher of a problem here, and was hoping that someone can help me resolve it. The issue is to do with a fatal error generated when dynamically loading MKL from a Linux shared library, which is in turn referenced by a Python module created with the SWIG interface-generation tool.

I'm encountering this issue on an Ubuntu Linux system, using gcc 4.6.3 to compile against the version of MKL included in Parallel Studio 2016 Update 2. I'm also using Python 2.7.3 and SWIG 2.0.4.  (I can consistently reproduce the problem using the Intel C compiler, Python 2.7.11, and/or Parallel Studio 2015 as well.) I am running in an environment with all necessary environment variables set, as produced by mklvars.sh (MKLROOT is there, LD_LIBRARY_PATH includes the MKL libraries, etc.).

I've created a stripped-down example to demonstrate my issue, but it's still a little complicated, so I will explain as I go. First, we define a C library called foo in the header/source pair foo.h/foo.c. This library exposes a single function, bar(), which makes a trivial BLAS call. (First code block is foo.h, second is foo.c.)

/* foo.h */
#ifndef _FOO_H
#define _FOO_H

void bar();

#endif//_FOO_H

/* foo.c */
#include "mkl.h"

void bar() {
    double arr[1] = { 1.0 };
    cblas_daxpy(1, 1, arr, 1, arr, 1);
}

To check that this function runs without errors, we use it in a simple executable, defined in main.c:

#include "foo.h"

int main() {
    bar();
    return 0;
}

Then we create a simple SWIG interface file foo.i, allowing generation of a Python interface for the foo library:

%module foo

%{
#define SWIG_FILE_WITH_INIT
#include "foo.h"
%}

%include "foo.h"

The main executable and the Python/SWIG module can be built using gcc, with the following sequence of commands. Note that the MKL linking options are precisely as recommended by the MKL link line advisor tool. With the exception of a warning about a set-but-unused variable in the SWIG wrapper, compilation proceeds cleanly.

gcc -Wall -Wextra -O0 -fPIC -I$MKLROOT/include -c -o foo.o foo.c
gcc -Wall -Wextra -O0 -shared -L$MKLROOT/lib/intel64 -Wl,-rpath=./ -o libfoo.so foo.o -Wl,--no-as-needed -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -ldl -lpthread -lm
gcc -Wall -Wextra -O0 -L. -Wl,-rpath=./ -o main main.c -lfoo
swig -python foo.i
gcc -Wall -Wextra -O0 -fPIC -I/usr/include/python2.7 -c -o foo_wrap.o foo_wrap.c
gcc -Wall -Wextra -O0 -shared -L. -L$MKLROOT/lib/intel64 -Wl,-rpath=./ -o _foo.so foo_wrap.o -lfoo -Wl,--no-as-needed -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -ldl -lpthread -lm

After building, the main executable runs without errors. However, attempting to use the generated SWIG module from within a Python interpreter (launched from the directory containing the various outputs of the compilation process) produces the following error:

Python 2.7.3 (default, Jun 22 2015, 19:33:41) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo
>>> foo.bar()
Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so.

By setting LD_DEBUG=libs and trying again, I can see that the error is connected to a symbol lookup error, with the error message:

[...]/libmkl_def.so: error: symbol lookup error: undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)
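
When the LD_DEBUG output is long, it can help to pull out just the failing library/symbol pairs. The following is a minimal sketch, assuming the loader's messages have the form shown above (optionally preceded by a process-ID column); the function name is purely illustrative.

```python
import re

# Matches LD_DEBUG lines of the form shown above, with or without
# the leading "<pid>:" column.
_LOOKUP_ERROR = re.compile(
    r"^\s*(?:\d+:\s+)?(?P<library>\S+): error: symbol lookup error: "
    r"undefined symbol: (?P<symbol>\S+)"
)


def find_lookup_errors(lines):
    """Return (library, symbol) pairs, one per symbol lookup error line."""
    errors = []
    for line in lines:
        match = _LOOKUP_ERROR.match(line)
        if match:
            errors.append((match.group("library"), match.group("symbol")))
    return errors
```

Since the debug output goes to stderr by default, one way to use this would be to redirect stderr to a file (say, a hypothetical ld_debug.log) and pass the open file object to find_lookup_errors().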

This symbol is defined in libmkl_core.so, which I believe everything should be linked against. The same error (or at least, the same "Intel MKL FATAL ERROR: ..." output) is reported in a post on this forum from December 2015: <https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/...>. Another forum post, linked from the original (<https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/...>), suggests using LD_PRELOAD to attempt to resolve the problem. Sure enough, if I open the Python interpreter with LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so python, the call to foo.bar() executes without issue. (Attempting to preload libmkl_avx2.so or libmkl_def.so without libmkl_core.so produces a symbol lookup error for the exact same symbol as before.)

So, the question is: can anybody suggest why this is happening, and hopefully suggest a fix that does not involve LD_PRELOAD? (We can't ship code that requires LD_PRELOAD...) My first thought was that this was a Python issue, but I'm not sure -- the other forum post reporting this problem was in relation to a tool called FuPerMod, which (from a quick look at the relevant git repo) doesn't seem to make any use of Python at all...


Hi J.B. 

Thanks for the details and investigation.  

It is good to know that with "LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so python, the call to foo.bar() executes without issue".

So that can be one workaround, right? Would you like to try the other workarounds we have mentioned?

1) Is it possible to link MKL statically, and so avoid the cascaded dynamic linking?

2) Or, if you only use a few MKL functions in your big project, there is a custom DLL tool in the MKL install folder, which can build your own DLL based on the MKL static libraries. Then you only need to distribute your own DLL and don't need to take care of the MKL DLLs.

Then let's consider the root cause.

libmkl_avx2.so is a DLL specifically optimized for CPUs that support the AVX2 instructions.

libmkl_core.so detects the CPU type and dispatches to processor-specific code at runtime. It loads the DLL with a call like dlopen("libmkl_avx2.so", RTLD_LAZY). This is not an issue when MKL is used from the main executable, which knows the LD_LIBRARY_PATH.

So the issue only happens in other environments, for example Python. The question is then how to make sure Python knows the LD_LIBRARY_PATH, or how to pass that information through your dynamic libfoo.so. LD_PRELOAD is one way, but as you mentioned, it is not a good way. That leaves an open question: are there any other ways?
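
One candidate answer, offered as an assumption and not confirmed anywhere in this thread: Python loads extension modules with dlopen() and, by default, without RTLD_GLOBAL, so symbols from libmkl_core.so that come in through _foo.so may not be visible to the libraries MKL later loads by itself. The sketch below temporarily adds RTLD_GLOBAL to the interpreter's dlopen flags; the helper name is hypothetical, and whether this actually resolves the error would need to be tested.

```python
import contextlib
import ctypes
import sys


@contextlib.contextmanager
def global_symbol_scope():
    """Temporarily make Python dlopen() extension modules with RTLD_GLOBAL."""
    previous = sys.getdlopenflags()
    sys.setdlopenflags(previous | ctypes.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the original flags so unrelated imports are unaffected.
        sys.setdlopenflags(previous)


# Usage with the SWIG module from this thread (requires the built _foo.so):
#
#     with global_symbol_scope():
#         import foo
#     foo.bar()
```

Restoring the flags afterwards limits the global symbol scope to the one import that seems to need it, which reduces the risk of symbol clashes between unrelated extension modules.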

Best Regards,

Ying

P.S. One more small issue about gcc and MKL: the link advisor gives

 -fopenmp -m64 -I${MKLROOT}/include

 -Wl,--no-as-needed -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lpthread -lm -ldl

and not -lmkl_gf_lp64.

Hi Ying,

Thanks for the quick response!

1) Linking against the static libraries is possible, and it does solve the issue. Unfortunately, it is highly undesirable. There are two main reasons for this. Firstly, the extra size of the various binaries produced with static linking is not inconsiderable, and would be better avoided if at all possible (from the quick tests I've done, around 20% at the moment, possibly higher). Secondly, we're using CMake for build/deployment, and handling static linking under our framework is fiddly and unlikely to work well in a cross-platform environment. Dynamic linking is by far the easiest option for us to manage.

2) A quick answer to a question you haven't yet asked, but probably will ;). This issue can be avoided by linking against the SDL rather than explicitly linking against the various dynamic libraries. However, this is not an acceptable solution, because our code will eventually be linking against BLACS/ScaLAPACK as well, which seemed to be completely incompatible with the SDL (at least under Parallel Studio 2015). I could go into this in more detail but I think it's an issue for another forum thread, perhaps.

I also suspect (but could be wrong, and would be interested to know if so) that using the SDL removes some of the opportunities for IPO at the link phase of compilation when using the Intel compiler suite. Obviously, squeezing out as much performance as possible is a good thing.

3) I'm pretty good at Python from a development perspective, but I'm definitely not an expert when it comes to how it works under the hood. I don't know if Python plays its own tricks with dynamic loading, although that might seem like one potential cause of the issue. Nevertheless, from a user's perspective, Python definitely exposes system environment variables as expected, without change. For example, executing the following in a Python interpreter:

> python 
Python 2.7.3 (default, Jun 22 2015, 19:33:41)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['LD_LIBRARY_PATH']

produces the same output as a simple `echo $LD_LIBRARY_PATH` on the command line.

I notice that Intel now supports an MKL-enabled Python distribution (something I've had to build myself in the past). Perhaps the devs working on that distribution have some idea about this issue? If it is indeed an inherent problem with Python, they've surely run across it.

4) Not to belabour the point, but while the LD_PRELOAD hack works, it is unacceptable for us from a deployment perspective. We do not have enough control over the environment our software is used in to enforce it. Also, I personally think it's pretty inelegant and dubious to use in production, especially if it's not clear why precisely it works.

5) Perhaps you're using a different version or something (or perhaps I am specifying something incorrectly) but the output from the link line advisor tool is indeed as stated/used in my example. I used the interactive option at first, but by explicitly specifying options on the command line, I get:

> $MKLROOT/tools/mkl_link_tool -libs -opts -env --compiler=gnu_c --arch=intel64 --linking=dynamic --parallel=yes --interface=lp64 --openmp=gomp

       Intel(R) Math Kernel Library (Intel(R) MKL) Link Tool v4.0
       ==========================================================

Output
======

Compiler option(s):
 -m64 -I$(MKLROOT)/include

Linking line:
 -L$(MKLROOT)/lib/intel64 -Wl,--no-as-needed -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -ldl -lpthread -lm

Environment variable(s):
export LD_LIBRARY_PATH=$(MKLROOT)/lib/intel64:$LD_LIBRARY_PATH;

If I'm doing something stupid here, I would be very happy to hear about it. :)

Hi J.B, 

Thanks a lot for the detailed reply.

I have escalated the problem to our build team.

Right, Intel now supports an MKL-enabled Python distribution. The package can be obtained from

https://software.intel.com/en-us/python-distribution

It adds new Python packages like scikit-learn, mpi4py, numba, conda, tbb (Python interfaces to Intel® Threading Building Blocks) and pyDAAL (Python interfaces to Intel® Data Analytics Acceleration Library). The Beta also delivers performance improvements for NumPy/SciPy through linking with performance libraries like Intel® MKL, Intel® Message Passing Interface (Intel® MPI), Intel® TBB and Intel® DAAL.

I will check with the team and see if they have more ideas about this. 

Best Regards,

Ying

 

Hi J.B.,

This is Frank from the Intel(R) Python* team.

We have seen similar issues with other libraries. Even though it might not apply here, let me describe what we found, just in case.

The root cause had been that the wrong version of the lib (e.g. from a different path) had been loaded even though LD_LIBRARY_PATH was properly set. Even ldd reported using the libs from the right paths. If you have a second MKL install on your system and your python executable or your SWIG generated library set rpath, this rpath will be preferred (and ldd silently ignores it). Setting LD_DEBUG=symbols and very closely looking at reported paths disclosed the issue.
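
As a complement to reading the LD_DEBUG output, the paths of the libraries actually mapped into a running process can be listed from inside Python. This is a minimal, Linux-only sketch that reads /proc/self/maps; the function names and the default "mkl" filter are illustrative choices, not anything provided by MKL or Python.

```python
def mapped_libraries(maps_text, needle="mkl"):
    """Return the sorted, de-duplicated paths in maps_text that contain needle."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # Named mappings carry a sixth field: a file path, or a
        # pseudo-name such as [heap].
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


def current_process_libraries(needle="mkl"):
    """Linux only: list files mapped into this process whose path has needle."""
    with open("/proc/self/maps") as maps:
        return mapped_libraries(maps.read(), needle)
```

Calling current_process_libraries() right after importing the extension module shows which MKL installation the loader picked at that point. Libraries that MKL opens later at run time will only appear once they have been loaded.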

It'd be interesting to see if Intel(R) Python* shows the same issue. Maybe it solves the issue for you?

frank

Hi Frank,

Thanks for having a look at this one and for your suggestions. :)

Your idea is interesting and might make sense! We do have multiple versions of MKL available on our system, corresponding to different versions of Parallel Studio, although to my knowledge they should all be compartmentalised -- I've hand-written module files for them, and nothing (I think) is installed in a system path (/usr/lib, etc.), so it would be normally unlikely that they would collide. (Correct use of the module files should limit, e.g., the LD_LIBRARY_PATH to only include one version of MKL.)

However, I generally use a virtual environment containing hand-built versions of numpy and scipy, linked against an older version of MKL. Since I'm neither explicitly nor implicitly importing either of those packages (or anything that depends on them), at least in the stripped-down example I originally gave, I wouldn't have expected that to be an issue. (Also, I thought I'd tried to use a brand-new venv with a cleanly-compiled version of Python 2.7.11...) But I guess I can't rule it out, and it would be worth another look...

I will try over the next day or two to isolate my problem as much as possible and avoid any possibility of cross-contamination with other libraries. Watch this space... :)

Also, I'd be quite interested to have a go with the Intel Python distribution, and see what happens. Unfortunately I don't think we have a license for it, and it wouldn't be a solution for us anyway (we can't ask our clients to use it, basically). But perhaps there's a trial version or something available? I guess you or someone at Intel can harvest my email from my profile -- feel free to contact me directly there if you like.

Thanks again!

After Parallel Studio beta ends, Intel Python will be available under the same community licensing as MKL and will have redistribution rights so licensing should not be a problem.

Hello,

I'm bumping this thread because I'm having the same issue as the original poster. I'm running a python application that calls custom libraries (written in C/C++) that are compiled with icc and linked against the MKL libraries available on our system. When I run on Westmere hardware, the application fails with:

Intel MKL FATAL ERROR: Cannot load libmkl_mc3.so or libmkl_def.so.

Sure enough, when run with LD_DEBUG=libs, I get the same symbol lookup error as OP:

   1166740:    /opt/intel/composer_xe_2015.5.223/mkl/lib/intel64/libmkl_mc3.so: error: symbol lookup error: undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)

and

   1166740:    /opt/intel/composer_xe_2015.5.223/mkl/lib/intel64/libmkl_def.so: error: symbol lookup error: undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)

Like the OP, I can run with LD_PRELOAD=...../libmkl_core.so to get around the issue, but this is NOT an option for production code or for our workflow. Confusingly, it also looks from the LD_DEBUG=libs output as though ldd is (seemingly? please correct me if I'm wrong; I'm still trying to figure this out as well) able to find and link against libmkl_core.so in the same directory as libmkl_mc3.so and libmkl_def.so:

...

...

   1166740:      trying file=/opt/intel/composer_xe_2015.5.223/mkl/lib/intel64/libmkl_core.so
   1166740:    calling init: /opt/intel/composer_xe_2015.5.223/mkl/lib/intel64/libmkl_core.so
   1166740:    calling fini: /opt/intel/composer_xe_2015.5.223/mkl/lib/intel64/libmkl_core.so [0]

Was there any resolution to this issue other than trying Intel Python and seeing what happens? Are there environment or compilation variables that I should be including when building my application to explicitly force icc to link against the correct MKL libraries? My LD_LIBRARY_PATH and LFLAGS all point to the same directories, in the correct order as far as I can tell. 

Thanks for your help, and please let me know if I can provide any other information. 

-Alex

Hi Alex and all,

When you build your custom .so with icc and MKL, would you please link against libmkl_rt.so instead of using -mkl, or -lmkl_intel_lp64 -lmkl_sequential -lmkl_core, or -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core, etc., and let us know if it works?

We have been reinvestigating the issue recently. It seems the problem still lies in the cross-reference structure of the MKL libraries combined with Python's dynamic loading mechanism.
For example, at run time Python opens libmkl_intel_lp64.so -> libmkl_sequential.so -> libmkl_core.so -> libmkl_avx2.so (or mc2.so), and libmkl_avx2.so (or mc2.so) in turn refers back to libmkl_sequential.so, etc.

We will update here if there is any news.

  /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_core.so: error: symbol lookup error: undefined symbol: COIProcessLoadSinkLibraryFromFile (fatal)
    238161:     /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_avx512.so: error: symbol lookup error: undefined symbol: mkl_sparse_optimize_bsr_trsm_i8 (fatal), which is defined in libmkl_sequential.so
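
A workaround of the kind sometimes used for cross-reference problems like this is to load the libraries that define the missing symbols into the global symbol scope from Python itself, before the first MKL call. The sketch below does this with ctypes; the helper name is hypothetical, and which libraries need to be listed is an assumption that depends on the link line in use.

```python
import ctypes


def preload_globally(library_names):
    """dlopen() each library with RTLD_GLOBAL; return the handles that loaded."""
    handles = {}
    for name in library_names:
        try:
            handles[name] = ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Library not found on the loader search path; skip it.
            pass
    return handles


# Usage, mirroring the LD_PRELOAD experiment reported earlier in the thread
# (requires MKL on LD_LIBRARY_PATH and the built SWIG module):
#
#     preload_globally(["libmkl_core.so"])
#     import foo
#     foo.bar()
```

Unlike LD_PRELOAD, this needs no change to the user's environment, but it still has to run before anything triggers MKL's run-time dispatch.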

Best Regards,

Ying

FYI, I see the same kind of issue when generating a Java API using SWIG: some MKL libs must be loaded with LD_PRELOAD.

 

Since this got bumped...

As of the 2019 versions of MKL, this is absolutely still an issue. As well as in various in-house code, I have seen it pop up as a problem in multiple open-source packages. (And it's explicitly acknowledged in some, including dodgy workaround hacks to avoid it.)

Does Intel have a fix for this yet?
