# Using Intel® MKL in your Python* program

By Todd R., published April 10, 2011

**Introduction**

This article describes how to use the Intel® Math Kernel Library (Intel® MKL) from a Python* program. There's more than one way to write Python programs to interface with native libraries. I've simply chosen one so that I can emphasize what might be less commonly known: how to build a custom shared library from Intel MKL so that you can call it from your script.

I'll run through the basic steps of accessing Intel MKL from Python 2.6 on a 64-bit Linux* OS. The example program calls the CBLAS interface to the DGEMM function, which performs a multiplication (and optional addition) on general, double-precision matrices. Much more about these functions can be found in the C version of the Developer Reference for Intel® MKL (available online here).

*Update*: With Intel MKL 10.3 or 11.0 there is a new dynamic library (libmkl_rt.so) which removes the need to create your own custom library. So if you're using 10.3 or later, you can skip the "Build a custom library" step below. To change the behavior of this library, look up these routines in the reference manual: mkl_set_interface_layer, mkl_set_threading_layer, mkl_set_xerbla, and mkl_set_progress.

**Build a custom library (*now unnecessary with Intel MKL 10.3 or later*):**

To interface with Intel MKL from Python we recommend you use the custom library builder in the tools/builder sub-directory of the Intel MKL package. The Intel® MKL User's Guide has documentation on this tool (docs online). Here, briefly, are the steps I took to do this:

- Set up your environment to use the desired version of Intel MKL:
  `source /<MKLpath>/tools/environment/mklvarsem64t.sh`
- Build the DLL:
  `cd /<MKLpath>/tools/builder`
  `make em64t name=~/libmkl4py export=cblas_list`

**Add library paths to LD_LIBRARY_PATH:**

All the Intel MKL libraries needed must be in directories listed in the LD_LIBRARY_PATH environment variable. The library as built above depends on the OpenMP* threading runtime library used by Intel MKL (libiomp5.so), so make sure that both libraries, libmkl4py.so and libiomp5.so, are in directories specified in LD_LIBRARY_PATH. If you're using Intel MKL 10.3 or later, you need to add the directories for both libmkl_rt.so and libiomp5.so (if you want it to run on multiple cores).

**Call Intel MKL in your Python script:**

The following is a simple script (also available here) that loads the shared library just created and calls the matrix function.

    from ctypes import *

    # Load the shared library
    mkl = cdll.LoadLibrary("./libmkl_rt.so")
    # For Intel MKL prior to version 10.3, use the created .so as below
    # mkl = cdll.LoadLibrary("./libmkl4py.so")
    cblas_dgemm = mkl.cblas_dgemm

    def print_mat(mat, m, n):
        # Print an m x n matrix stored row by row in a one-dimensional array
        for i in xrange(0,m):
            print " ",
            for j in xrange(0,n):
                print mat[i*n+j],
            print

    # Initialize scalar data
    Order = 101   # 101 for row-major, 102 for column major data structures
    TransA = 111  # 111 for no transpose, 112 for transpose, and 113 for conjugate transpose
    TransB = 111
    m = 2
    n = 4
    k = 3
    lda = k
    ldb = n
    ldc = n
    alpha = 1.0
    beta = -1.0

    # Create contiguous space for the double precision array
    amat = c_double * 6
    bmat = c_double * 12
    cmat = c_double * 8

    # Initialize the data arrays
    a = amat(1,2,3, 4,5,6)
    b = bmat(0,1,0,1, 1,0,0,1, 1,0,1,0)
    c = cmat(5,1,3,3, 11,4,6,9)

    print "\nMatrix A ="
    print_mat(a,2,3)
    print "\nMatrix B ="
    print_mat(b,3,4)
    print "\nMatrix C ="
    print_mat(c,2,4)
    print "\nCompute", alpha, "* A * B + ", beta, "* C"

    # Call Intel MKL by casting scalar parameters and passing arrays by reference
    cblas_dgemm( c_int(Order), c_int(TransA), c_int(TransB),
                 c_int(m), c_int(n), c_int(k),
                 c_double(alpha), byref(a), c_int(lda),
                 byref(b), c_int(ldb),
                 c_double(beta), byref(c), c_int(ldc))

    print_mat(c,2,4)
    print
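To sanity-check the result of the call without Intel MKL, the same operation can be written in plain Python. The sketch below is not part of the original script: `ref_dgemm_row_major` is a hypothetical helper, not an Intel MKL function, that computes alpha * A * B + beta * C on flat, row-major lists using the same m, n, k, lda, ldb, and ldc conventions. It is written to run on Python 3 as well as Python 2.6. For the data in the script, A * B happens to equal the initial C, so with alpha = 1.0 and beta = -1.0 every element of the result should be zero.

```python
def ref_dgemm_row_major(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Return alpha * A * B + beta * C as a new flat, row-major list.

    Hypothetical reference helper (not an Intel MKL function).
    A is m x k, B is k x n, C is m x n; lda, ldb, ldc are the row strides.
    """
    out = list(c)
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                # Element (i, p) of A times element (p, j) of B
                acc += a[i * lda + p] * b[p * ldb + j]
            out[i * ldc + j] = alpha * acc + beta * c[i * ldc + j]
    return out


# Same data as in the script
a = [1, 2, 3, 4, 5, 6]
b = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]
c = [5, 1, 3, 3, 11, 4, 6, 9]

expected = ref_dgemm_row_major(2, 4, 3, 1.0, a, 3, b, 4, -1.0, c, 4)
print(expected)
```

Comparing this list with the contents of `c` after the `cblas_dgemm` call is a quick way to confirm that the argument order and leading dimensions were passed correctly.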

A few notes:

- Matrices in the BLAS and LAPACK parts of Intel MKL are stored in one-dimensional arrays, and integers are used to specify their geometry.
- Here I've loaded the CBLAS interface to the general matrix multiply function, which lets you choose how the matrix is laid out. In my script I've listed the matrices by rows (row-major ordering). If you do not use the CBLAS interface to the BLAS, or if you use LAPACK, keep in mind that these functions assume the Fortran convention of listing matrices by columns (column-major ordering).
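The difference between the two orderings comes down to how an element's (row, column) position maps to an offset in the one-dimensional array. The sketch below illustrates this with hypothetical helper names (`row_major_index`, `col_major_index`, `to_col_major`), none of which are Intel MKL functions. It assumes a densely packed matrix, so the leading dimension equals the number of columns for row-major storage and the number of rows for column-major storage.

```python
from ctypes import c_double


def row_major_index(i, j, ld):
    # Row-major (C style): rows are contiguous, ld is the row stride
    return i * ld + j


def col_major_index(i, j, ld):
    # Column-major (Fortran style): columns are contiguous, ld is the column stride
    return i + j * ld


def to_col_major(mat, rows, cols):
    """Copy a packed row-major matrix into a new packed column-major ctypes array."""
    out = (c_double * (rows * cols))()
    for i in range(rows):
        for j in range(cols):
            out[col_major_index(i, j, rows)] = mat[row_major_index(i, j, cols)]
    return out


# The 2 x 3 matrix A from the script, listed by rows
a_rows = (c_double * 6)(1, 2, 3, 4, 5, 6)
a_cols = to_col_major(a_rows, 2, 3)
print(list(a_cols))  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

An array converted this way would be the form expected when passing Order = 102 to the CBLAS interface, with the leading dimension of a non-transposed matrix set to its number of rows rather than its number of columns.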

Here is the Python code I created that implements the steps above: matmult.py

**Example code:**

We have extended the set of examples to show how to call other Intel MKL routines from a Python program, beyond the commonly used dgemm.

See the three attached examples:

dft.zip - shows a Python program calling the 1D DFTI API

spblas.zip - shows how to call the matrix-matrix multiplication routine for a sparse matrix stored in the block compressed sparse row (BSR) format

vsl.zip - shows how to call the vdRngGaussian routine (which generates normally distributed random numbers) from the VSL domain

**Notes:**

Each zip file contains a `*.res` file and a `*_list` file: the input file and the function list used to build the custom library, respectively.