Recommended settings for calling Intel MKL routines from multi-threaded applications


Recommended settings for calling Intel MKL routines from multi-threaded applications (see note 1).
 

Choose linking model
 

Set additional parameters (see note 2)

Comments

Link with sequential threading layer of MKL.
 

N/A

MKL threading is not needed.
Example: you believe the threads of your application already use all physical cores of the system, or that MKL threading would lead to oversubscription (see note 3).
 

Link with parallel threading layer of MKL.

MKL_NUM_THREADS=1
 


This case is equivalent to linking with sequential MKL, that is, it disables threading in MKL. The same effect is obtained by linking with the threaded version of MKL and calling mkl_set_num_threads(1).

 


Use parameters below to enable MKL threading inside the threading of your application.
 

MKL_NUM_THREADS=N

Enable MKL threading - use when you are sure that there are enough resources (physical cores) for MKL threading in addition to your own threads. Choose N carefully.

Example 1:

The application has 2 threads, each thread calls MKL, and the system has 8 cores: it is reasonable to set MKL_NUM_THREADS=4.

Example 2:

An MKL function is called from a critical section of a parallel region: set MKL_NUM_THREADS=N, where N is the number of physical cores in the system (or use the mkl_set_num_threads(N) routine).
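The arithmetic in Example 1 above can be sketched as a small shell helper. This is a minimal sketch, not an Intel-provided tool: the function name mkl_threads_per_app_thread is hypothetical, and the core count is passed in by hand because tools such as nproc report logical CPUs, which may exceed the number of physical cores when hyper-threading is enabled.

```shell
# Hypothetical helper: divide the physical cores evenly among the
# application threads that call MKL, never going below 1.
mkl_threads_per_app_thread() {
    cores=$1        # number of physical cores (supplied by the caller)
    app_threads=$2  # number of application threads calling MKL
    n=$(( cores / app_threads ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example 1 from the table: 8 cores, 2 application threads.
MKL_NUM_THREADS=$(mkl_threads_per_app_thread 8 2)
export MKL_NUM_THREADS
echo "MKL_NUM_THREADS=$MKL_NUM_THREADS"   # prints MKL_NUM_THREADS=4
```

An even split is only a starting point; whether it is the best value of N depends on how much of the run time each application thread actually spends inside MKL.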

NOTE:
set additional options when the application is based on OpenMP* threads.
 


MKL_DYNAMIC=false OMP_NESTED=true

omp_set_max_active_levels(2)

Apply these options if MKL is called from an OpenMP* parallel region and you want to enable both OpenMP* and MKL threading.

Example:
Call the MKL routine dsyev from your own OpenMP* parallel region with 2 OpenMP* threads and 8 MKL threads.
MKL_DYNAMIC=false
OMP_NESTED=true
OMP_NUM_THREADS=2 MKL_NUM_THREADS=8
omp_set_max_active_levels(2) restricts MKL to spawning only the first level of nested threads, which avoids significant overhead.
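As a sketch, the settings from this example can be exported in the shell before launching the application. The binary name ./my_dsyev_app is hypothetical. OMP_MAX_ACTIVE_LEVELS is used here as the standard OpenMP* environment-variable counterpart of omp_set_max_active_levels(2); the table itself sets this limit through the function call, so treat the variable as an assumption about how you prefer to configure it.

```shell
# Settings from the example: 2 OpenMP threads, 8 MKL threads.
export MKL_DYNAMIC=false        # MKL does not adjust its thread count dynamically
export OMP_NESTED=true          # allow nested parallel regions
export OMP_NUM_THREADS=2        # threads in the application's own parallel region
export MKL_NUM_THREADS=8        # threads MKL may use
export OMP_MAX_ACTIVE_LEVELS=2  # assumption: env counterpart of omp_set_max_active_levels(2)

APP=./my_dsyev_app              # hypothetical application binary
if [ -x "$APP" ]; then
    "$APP"
fi
```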


MKL_NUM_THREADS=1
MKL_DOMAIN_NUM_THREADS="DOMAIN=N"


Enable MKL threading for specific MKL domains only (BLAS, FFT, VML, PARDISO).

The MKL_DOMAIN_ALL setting affects all MKL routines.

Example:
The application calls some MKL routines from different threads, relying on its own threading, and then calls MKL FFT from the serial part.
Set MKL_NUM_THREADS=1 and MKL_DOMAIN_NUM_THREADS="MKL_FFT=N",
where N is the number of physical cores in the system.
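A sketch of these two settings in the shell, assuming a system with 8 physical cores. The value 8 and the variable name CORES are illustrative only; the domain string follows the "MKL_FFT=N" form given above.

```shell
CORES=8                                         # assumed number of physical cores
export MKL_NUM_THREADS=1                        # MKL runs single-threaded by default
export MKL_DOMAIN_NUM_THREADS="MKL_FFT=$CORES"  # the FFT domain may use all cores
echo "$MKL_DOMAIN_NUM_THREADS"                  # prints MKL_FFT=8
```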


MKL_DYNAMIC=true
This option may reduce oversubscription caused by MKL threading. It lets the number of OpenMP* threads be reduced dynamically, based on an analysis of the system workload.

This option is available with the Intel® OpenMP* runtime library (libiomp5).
 

For environment settings and best known methods for using Intel MKL on the Intel® Xeon Phi™ coprocessor, refer to the article "Environment Settings on Intel® Xeon Phi™ Coprocessor".

 
1. This refers to any multi-threading environment: OpenMP*, Intel® PBB, POSIX* threads, Windows* threads, etc.
 
2. An alternative way to set additional parameters is to use function calls (instead of environment variables as in this table). Read the Intel® MKL and OpenMP* documentation for more details. MKL function calls take precedence over any environment variables, and MKL environment variables take precedence over the OpenMP* environment variables.
 
3. Oversubscription is the situation when the application has more active threads than available physical cores.
That may lead to performance degradation.
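The definition in note 3 can be turned into a rough check. This is a minimal sketch under a simplifying assumption: every application thread calls MKL at the same time and each call uses the full MKL thread count, so the number of active threads is the product of the two. The function name is_oversubscribed is hypothetical.

```shell
# Hypothetical check: succeeds (exit status 0) when the estimated number
# of active threads exceeds the number of physical cores.
is_oversubscribed() {
    cores=$1        # physical cores in the system
    app_threads=$2  # application threads calling MKL
    mkl_threads=$3  # MKL_NUM_THREADS used by each call
    [ $(( app_threads * mkl_threads )) -gt "$cores" ]
}

# 2 application threads x 8 MKL threads on 8 cores: 16 threads for 8 cores.
if is_oversubscribed 8 2 8; then
    echo "oversubscribed"
else
    echo "ok"
fi
```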
 
 