Overview
Note
This publication, the Intel Math Kernel Library Developer Reference, was previously known as the Intel Math Kernel Library Reference Manual.
Intel MKL is optimized for the latest Intel processors, including processors with multiple cores (see the Intel MKL Release Notes for the full list of supported processors). Intel MKL also performs well on non-Intel processors.
Note
It is your responsibility when using Intel MKL to ensure that input data has the required format and does not contain invalid characters. These can cause unexpected behavior of the library.
The library requires that subroutine and function parameters be valid before they are passed. Some Intel MKL routines perform limited parameter checking, but your application should still validate its arguments, for example by checking for NULL pointers.
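As an illustration of the caller-side checks described above, here is a minimal sketch in Python. The helper name `validate_vector` is hypothetical and is not part of Intel MKL, and treating NaN and infinity as invalid input is an assumption of this sketch rather than a documented library rule.

```python
import math

def validate_vector(x, n):
    """Reject missing, too-short, or non-finite input before a numerical call."""
    if x is None:
        # Python analogue of a NULL pointer check.
        raise ValueError("input vector is missing (None)")
    if n < 0 or len(x) < n:
        raise ValueError("vector holds fewer than n elements")
    for i in range(n):
        # Assumption: NaN and infinity count as invalid input here.
        if not math.isfinite(x[i]):
            raise ValueError(f"element {i} is not finite: {x[i]!r}")
    return True
```

A caller would run such a check once at the boundary of the application, before handing the data to any library routine.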
The Intel® Math Kernel Library includes Fortran routines and functions optimized for Intel® processor-based computers running operating systems that support multiprocessing. In addition to the Fortran interface, Intel MKL includes a C-language interface for the Discrete Fourier transform functions, as well as for the Vector Mathematics and Vector Statistics functions. For hardware and software requirements to use Intel MKL, see Intel® MKL Release Notes.
BLAS Routines
The BLAS routines and functions are divided into the following groups according to the operations they perform:

BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical operations include scaling and dot products.

BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication, rank-1 and rank-2 matrix updates, and solution of triangular systems.

BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication, rank-k update, and solution of triangular systems.
Starting from release 8.0, Intel® MKL also supports the Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are added to enable the user to perform certain data manipulation, including matrix in-place and out-of-place transposition operations combined with simple matrix arithmetic operations.
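The three BLAS levels can be pictured with a small reference sketch in pure Python. The function names below echo common BLAS routine names (dot, axpy, gemv, gemm), but the signatures are simplified assumptions for illustration and are not the Intel MKL interfaces.

```python
def dot(x, y):
    """Level 1: vector-vector reduction, O(n) work."""
    return sum(a * b for a, b in zip(x, y))

def axpy(alpha, x, y):
    """Level 1: scaled vector addition, returns alpha*x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def gemv(a, x):
    """Level 2: matrix-vector product, O(n^2) work."""
    return [dot(row, x) for row in a]

def gemm(a, b):
    """Level 3: matrix-matrix product, O(n^3) work."""
    cols = list(zip(*b))  # columns of b
    return [[dot(row, col) for col in cols] for row in a]

a = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
print(dot(x, x))          # 2.0
print(axpy(2.0, x, x))    # [3.0, 3.0]
print(gemv(a, x))         # [3.0, 7.0]
print(gemm(a, a))         # [[7.0, 10.0], [15.0, 22.0]]
```

Each level does more arithmetic per element of data touched, which is why optimized libraries tend to gain the most on Level 3 operations.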
Sparse BLAS Routines
The Sparse BLAS Level 1 Routines and Functions and the Sparse BLAS Level 2 and Level 3 Routines operate on sparse vectors and matrices. These routines perform operations similar to the BLAS Level 1, 2, and 3 routines. The Sparse BLAS routines take advantage of vector and matrix sparsity: they allow you to store only nonzero elements of vectors and matrices. Intel MKL also supports the Fortran 95 interface to Sparse BLAS routines.
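To make the storage idea concrete, the following sketch keeps only the nonzero elements of a vector together with their positions and computes a dot product against a dense vector. The helper names `compress` and `sparse_dot` are hypothetical, and the zero-based indices are a Python convention assumed here, not a statement about the Intel MKL storage formats.

```python
def compress(dense):
    """Return (values, indices) holding only the nonzero entries."""
    indices = [i for i, v in enumerate(dense) if v != 0.0]
    values = [dense[i] for i in indices]
    return values, indices

def sparse_dot(values, indices, y):
    """Dot product of a compressed sparse vector with a dense vector y."""
    return sum(v * y[i] for v, i in zip(values, indices))

dense = [0.0, 3.0, 0.0, 0.0, -2.0, 0.0]
values, indices = compress(dense)
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(values, indices)                 # [3.0, -2.0] [1, 4]
print(sparse_dot(values, indices, y))  # -4.0
```

The work done by `sparse_dot` is proportional to the number of stored nonzeros rather than to the full vector length.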
LAPACK Routines
The Intel® Math Kernel Library fully supports the LAPACK 3.7 set of computational, driver, auxiliary and utility routines.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they perform:

Routines for solving systems of linear equations, factoring and inverting matrices, and estimating condition numbers (see LAPACK Routines: Linear Equations).

Routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's equations (see LAPACK Routines: Least Squares and Eigenvalue Problems).
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to LAPACK computational and driver routines. This interface provides an opportunity for simplified calls of LAPACK routines with fewer required arguments.
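The first group of routines above covers factoring a matrix and solving a linear system. As a rough illustration of that task only, here is a small Gaussian elimination with partial pivoting in pure Python. The function name `solve_linear_system` is hypothetical; this is a teaching sketch and does not reproduce the LAPACK or Intel MKL calling sequences.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        # Pick the largest pivot in column k for numerical stability.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

x = solve_linear_system([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
```

For this 2-by-2 system the exact solution is x = (0.8, 1.4).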
Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN)
Intel® Math Kernel Library (Intel® MKL) functions for Deep Neural Networks (DNN functions) are a collection of performance primitives for Deep Neural Networks (DNN) applications optimized for Intel® architecture. The implementation of DNN functions includes a limited set of primitives used in the AlexNet topology.
The primitives implement forward and backward passes for several convolution, pooling, normalization, activation, and multidimensional transposition operations.
Intel MKL DNN primitives implement a plain C application programming interface (API) that can be used in the existing C/C++ DNN frameworks, as well as in custom DNN applications.
Sparse Solver Routines
Direct sparse solver routines in Intel MKL (see Sparse Solver Routines) solve systems with symmetric and symmetrically-structured sparse matrices with real or complex coefficients. For symmetric matrices, these Intel MKL subroutines can solve both positive-definite and indefinite systems. Intel MKL includes a solver based on the PARDISO* sparse solver, referred to as Intel MKL PARDISO, as well as an alternative set of user-callable direct sparse solver routines.
If you use the Intel MKL PARDISO sparse solver, please cite:
O. Schenk and K. Gartner. Solving unsymmetric sparse systems of linear equations with PARDISO. J. of Future Generation Computer Systems, 20(3):475-487, 2004.
Intel MKL also provides an iterative sparse solver (see Sparse Solver Routines) that uses Sparse BLAS level 2 and 3 routines and works with different sparse data formats.
Extended Eigensolver Routines
The Extended Eigensolver RCI Routines are a set of high-performance numerical routines for solving standard (Ax = λx) and generalized (Ax = λBx) eigenvalue problems, where A and B are symmetric or Hermitian. The routines yield all the eigenvalues and eigenvectors within a given search interval. They are based on the Feast algorithm, an innovative fast and stable numerical algorithm presented in [Polizzi09], which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms [Bai00]) or other Davidson-Jacobi techniques [Sleijpen96]. The Feast algorithm is inspired by the density-matrix representation and contour integration technique in quantum mechanics.
The Feast algorithm is free from orthogonalization procedures. Its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The Feast algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures. This algorithm is expected to significantly augment numerical performance in large-scale modern applications.
Some of the characteristics of the Feast algorithm [Polizzi09] are:

Converges quickly in 2-3 iterations with very high accuracy

Naturally captures all eigenvalue multiplicities

No explicit orthogonalization procedure

Can reuse the basis of a precomputed subspace as a suitable initial guess for performing outer-refinement iterations
This capability can also be used for solving a series of eigenvalue problems that are close to one another.

The number of internal iterations is independent of the size of the system and the number of eigenpairs in the search interval

The inner linear systems can be solved either iteratively (even with modest relative residual error) or directly
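The contour integration idea mentioned above can be illustrated on a tiny example. The sketch below approximates the spectral projector (1/(2πi)) ∮ (zI - A)^(-1) dz over a circle for a 2-by-2 symmetric matrix, using the trapezoidal rule; the trace of the result estimates how many eigenvalues lie inside the contour. This is only a sketch of the underlying mathematics under simplifying assumptions (fixed 2-by-2 size, closed-form inverse, hypothetical function names), not the Feast algorithm itself and not the Extended Eigensolver interface.

```python
import cmath

def resolvent_2x2(a, z):
    """Return (z*I - a)^(-1) for a 2x2 matrix using the closed-form inverse."""
    m00, m01 = z - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], z - a[1][1]
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]

def spectral_projector(a, center, radius, n_points=32):
    """Trapezoidal-rule approximation of the contour integral of the
    resolvent over the circle |z - center| = radius, divided by 2*pi*i."""
    p = [[0j, 0j], [0j, 0j]]
    for k in range(n_points):
        w = radius * cmath.exp(2j * cmath.pi * k / n_points)
        r = resolvent_2x2(a, center + w)
        for i in range(2):
            for j in range(2):
                p[i][j] += w * r[i][j] / n_points
    return p

a = [[2.0, 1.0], [1.0, 2.0]]       # eigenvalues are 1 and 3
p = spectral_projector(a, center=1.0, radius=1.0)
count = (p[0][0] + p[1][1]).real   # trace: approximately 1 eigenvalue inside
```

Each quadrature node requires one linear solve with the shifted matrix, which mirrors the role of the inner independent linear systems described above.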
VM Functions
The Vector Mathematics functions (see Vector Mathematical Functions) include a set of highly optimized implementations of certain computationally expensive core mathematical functions (power, trigonometric, exponential, hyperbolic, etc.) that operate on vectors of real and complex numbers.
Application programs that might significantly improve performance with VM include nonlinear programming software, computation of integrals, and many others. VM provides interfaces for both Fortran and C.
Statistical Functions
Vector Statistics (VS) contains three sets of functions (see Statistical Functions) providing:

Pseudorandom, quasi-random, and non-deterministic random number generator subroutines implementing basic continuous and discrete distributions. To provide best performance, the VS subroutines use calls to highly optimized Basic Random Number Generators (BRNGs) and a set of vector mathematical functions.

A wide variety of convolution and correlation operations.

Initial statistical analysis of raw single and double precision multidimensional datasets.
Fourier Transform Functions
The Intel® MKL multidimensional Fast Fourier Transform (FFT) functions with mixed radix support (see Fourier Transform Functions) provide uniformity of discrete Fourier transform computation and combine functionality with ease of use. Both Fortran and C interface specifications are given. There is also a cluster version of FFT functions, which runs on distributed-memory architectures and is provided only for Intel® 64 and Intel® Many Integrated Core architectures.
The FFT functions provide fast computation via the FFT algorithms for arbitrary lengths. See the Intel® MKL Developer Guide for the specific radices supported.
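For reference, the quantity that an FFT computes is the discrete Fourier transform. The sketch below evaluates it directly in O(n^2) operations for an arbitrary length, which is useful as a correctness check but is not how a fast implementation works. The function names, the sign convention, and the 1/n scaling on the inverse are assumptions of this sketch and are not taken from the Intel MKL interface.

```python
import cmath

def dft(x):
    """Direct O(n^2) forward discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Direct inverse transform, scaled by 1/n so that idft(dft(x)) == x."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # length 5, not a power of two
spectrum = dft(x)
roundtrip = idft(spectrum)
```

The zero-frequency term `spectrum[0]` equals the sum of the inputs, and `roundtrip` reproduces `x` up to rounding error.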
Partial Differential Equations Support
Intel® MKL provides tools for solving Partial Differential Equations (PDE) (see Partial Differential Equations Support). These tools are Trigonometric Transform interface routines and Poisson Solver.
The Trigonometric Transform routines may be helpful to users who implement their own solvers similar to the Intel MKL Poisson Solver. The users can improve performance of their solvers by using fast sine, cosine, and staggered cosine transforms implemented in the Trigonometric Transform interface.
The Poisson Solver is designed for fast solving of simple Helmholtz, Poisson, and Laplace problems. The Trigonometric Transform interface, which underlies the solver, is based on the Intel MKL FFT interface (refer to Fourier Transform Functions), optimized for Intel® processors.
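The connection between sine transforms and a Poisson solver can be seen in one dimension. The sketch below solves -u'' = f on the unit interval with u = 0 at both ends: it applies a type-I discrete sine transform, divides by the eigenvalues of the second-difference operator, and transforms back. The transform is computed directly in O(n^2) for clarity, and the names `dst` and `solve_poisson_1d` are hypothetical; none of this reflects the Intel MKL Poisson Solver or Trigonometric Transform interfaces.

```python
import math

def dst(x):
    """Type-I discrete sine transform, computed directly in O(n^2)."""
    n = len(x)
    return [
        sum(x[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1)) for j in range(n))
        for k in range(n)
    ]

def solve_poisson_1d(f, h):
    """Solve -u'' = f on a uniform interior grid with u = 0 at both ends."""
    n = len(f)
    f_hat = dst(f)
    # Eigenvalues of the second-difference operator in the sine basis.
    lam = [(2.0 - 2.0 * math.cos(math.pi * (k + 1) / (n + 1))) / h**2 for k in range(n)]
    u_hat = [fh / l for fh, l in zip(f_hat, lam)]
    # The type-I sine transform is its own inverse up to the factor 2/(n+1).
    return [2.0 / (n + 1) * v for v in dst(u_hat)]

n = 31
h = 1.0 / (n + 1)
grid = [(j + 1) * h for j in range(n)]
f = [math.pi**2 * math.sin(math.pi * xj) for xj in grid]   # exact u is sin(pi*x)
u = solve_poisson_1d(f, h)
max_error = max(abs(uj - math.sin(math.pi * xj)) for uj, xj in zip(u, grid))
```

Replacing the direct transform with a fast sine transform is what makes this approach efficient on large grids.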
Support Functions
The Intel® MKL support functions (see Support Functions) are used to support the operation of the Intel MKL software and provide basic information on the library and library operation, such as the current library version, timing, setting and measuring of CPU frequency, error handling, and memory allocation.
Starting from release 10.0, the Intel MKL support functions provide additional threading control.
Starting from release 10.1, Intel MKL selectively supports a Progress Routine feature to track progress of a lengthy computation and/or interrupt the computation using a callback function mechanism. The user application can define a function called mkl_progress that is regularly called from the Intel MKL routine supporting the progress routine feature. See Progress Routine in Support Functions for reference. Refer to a specific LAPACK or DSS/PARDISO function description to see whether the function supports this feature or not.
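The callback mechanism described above follows a common pattern: a long-running routine periodically calls a user-supplied function, which can report progress and request an early stop. The sketch below shows that pattern in plain Python. The names `long_computation` and `stop_after_three` and the callback arguments are hypothetical; the actual mkl_progress signature is given in Support Functions and is not reproduced here.

```python
def long_computation(n_steps, progress=None):
    """Run n_steps units of work, reporting to an optional callback.

    The callback receives (step, stage) and returns True to request a stop.
    Returns (partial_result, finished_flag).
    """
    total = 0
    for step in range(1, n_steps + 1):
        total += step                      # stand-in for real work
        if progress is not None and progress(step, "accumulate"):
            return total, False            # interrupted by the callback
    return total, True

seen = []

def stop_after_three(step, stage):
    """Example callback: record each report and stop at the third step."""
    seen.append((step, stage))
    return step >= 3

result, finished = long_computation(10, stop_after_three)
print(result, finished)   # 6 False
```

Without a callback, the same routine runs to completion and returns the full result.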
Optimization Notice 

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804