Developer Guide

Improving Performance with Threading

Intel® Math Kernel Library (Intel® MKL) is extensively parallelized. See OpenMP* Threaded Functions and Problems and Functions Threaded with Intel® Threading Building Blocks for lists of threaded functions and of problems that can be threaded.
Intel® MKL is thread-safe, which means that all Intel® MKL functions (except the deprecated LAPACK routine ?lacon) work correctly during simultaneous execution by multiple threads. In particular, any chunk of threaded Intel® MKL code provides access for multiple threads to the same shared data, while permitting only one thread at any given time to access a shared piece of data. Therefore, you can call Intel® MKL from multiple threads and not worry about the function instances interfering with each other.
If you are using OpenMP* threading technology, you can specify the number of threads with the environment variable OMP_NUM_THREADS or with the equivalent OpenMP run-time function calls. Intel® MKL also offers variables that are independent of OpenMP, such as MKL_NUM_THREADS, and equivalent Intel® MKL functions for thread management. The Intel® MKL variables are always inspected first, then the OpenMP variables are examined, and if neither is used, the OpenMP software chooses the default number of threads.
By default, Intel® MKL uses a number of OpenMP threads equal to the number of physical cores on the system.
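The lookup order described above (Intel® MKL variable first, then the OpenMP variable, then the run-time default) can be illustrated with a small standalone sketch. This is not Intel® MKL code and does not call the library. The helper names parse_thread_count, resolve_thread_count, and resolve_thread_count_from_env are hypothetical, and the caller supplies the run-time default. As a simplification, any value that is not a single positive integer is treated as unset, which may differ from how the real run-time libraries handle such values.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a setting such as "8".
 * Returns 0 when the text is missing or is not a single positive integer
 * (a simplification made for this sketch). */
int parse_thread_count(const char *text)
{
    if (text == NULL || *text == '\0')
        return 0;

    errno = 0;
    char *end = NULL;
    long value = strtol(text, &end, 10);

    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return 0;
    return (int)value;
}

/* Hypothetical helper: apply the precedence described in the text.
 * 1. the Intel MKL setting (for example, the value of MKL_NUM_THREADS)
 * 2. the OpenMP setting (for example, the value of OMP_NUM_THREADS)
 * 3. the default chosen by the run-time, passed in by the caller */
int resolve_thread_count(const char *mkl_setting,
                         const char *omp_setting,
                         int runtime_default)
{
    assert(runtime_default > 0);

    int count = parse_thread_count(mkl_setting);
    if (count > 0)
        return count;

    count = parse_thread_count(omp_setting);
    if (count > 0)
        return count;

    return runtime_default;
}

/* Convenience wrapper that reads the two environment variables
 * named in the text from the current process environment. */
int resolve_thread_count_from_env(int runtime_default)
{
    return resolve_thread_count(getenv("MKL_NUM_THREADS"),
                                getenv("OMP_NUM_THREADS"),
                                runtime_default);
}
```

With these helpers, resolve_thread_count("4", "8", 2) yields 4 because the Intel® MKL setting takes precedence, resolve_thread_count(NULL, "8", 2) yields 8, and resolve_thread_count(NULL, NULL, 2) falls back to the supplied default of 2.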
If you are using the Intel TBB threading technology, the OpenMP threading controls, such as the OMP_NUM_THREADS or MKL_NUM_THREADS environment variable, have no effect. Use the Intel TBB application programming interface to control the number of threads.
To achieve higher performance, set the number of threads to the number of processors or physical cores, as summarized in Techniques to Set the Number of Threads.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE4.2, AVX2, AVX-512.
