This publication, the Intel Math Kernel Library Developer Reference, was previously known as the Intel Math Kernel Library Reference Manual.
Intel MKL is optimized for the latest Intel processors, including processors with multiple cores (see the Intel MKL Release Notes for the full list of supported processors). Intel MKL also performs well on non-Intel processors.
It is your responsibility when using Intel MKL to ensure that input data has the required format and does not contain invalid characters. These can cause unexpected behavior of the library.
The library requires subroutine and function parameters to be valid before being passed. While some Intel MKL routines do limited checking of parameter errors, your application should check for NULL pointers, for example.
The Intel® Math Kernel Library includes Fortran routines and functions optimized for Intel® processor-based computers running operating systems that support multiprocessing. In addition to the Fortran interface, Intel MKL includes a C-language interface for the Discrete Fourier transform functions, as well as for the Vector Mathematics and Vector Statistics functions. For hardware and software requirements to use Intel MKL, see Intel® MKL Release Notes.
The BLAS routines and functions are divided into the following groups according to the operations they perform:
BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical operations include scaling and dot products.
BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication, rank-1 and rank-2 matrix updates, and solution of triangular systems.
BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication, rank-k update, and solution of triangular systems.
Starting from release 8.0, Intel® MKL also supports the Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are added to enable the user to perform certain data manipulation, including matrix in-place and out-of-place transposition operations combined with simple matrix arithmetic operations.
Sparse BLAS Routines
The Sparse BLAS Level 1 Routines and Functions and Sparse BLAS Level 2 and Level 3 Routines routines and functions operate on sparse vectors and matrices. These routines perform vector operations similar to the BLAS Level 1, 2, and 3 routines. The Sparse BLAS routines take advantage of vector and matrix sparsity: they allow you to store only non-zero elements of vectors and matrices. Intel MKL also supports Fortran 95 interface to Sparse BLAS routines.
The Intel® Math Kernel Library fully supports the LAPACK 3.7 set of computational, driver, auxiliary and utility routines.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they perform:
Routines for solving systems of linear equations, factoring and inverting matrices, and estimating condition numbers (see LAPACK Routines: Linear Equations).
Routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's equations (see LAPACK Routines: Least Squares and Eigenvalue Problems).
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to LAPACK computational and driver routines. This interface provides an opportunity for simplified calls of LAPACK routines with fewer required arguments.
The ScaLAPACK package (provided only for Intel® 64 and Intel® Many Integrated Core architectures, see ScaLAPACK Routines ) runs on distributed-memory architectures and includes routines for solving systems of linear equations, solving linear least squares problems, eigenvalue and singular value problems, as well as performing a number of related computational tasks.
The original versions of ScaLAPACK from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L. Blackford, J. Choi, A.Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K.Stanley, D. Walker, and R. Whaley.
The Intel MKL version of ScaLAPACK is optimized for Intel® processors and uses MPICH version of MPI as well as Intel MPI.
The PBLAS routines perform operations with distributed vectors and matrices.
PBLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical operations include scaling and dot products.
PBLAS Level 2 Routines perform distributed matrix-vector operations, such as matrix-vector multiplication, rank-1 and rank-2 matrix updates, and solution of triangular systems.
PBLAS Level 3 Routines perform distributed matrix-matrix operations, such as matrix-matrix multiplication, rank-k update, and solution of triangular systems.
Intel MKL provides the PBLAS routines with interface similar to the interface used in the Netlib PBLAS (part of the ScaLAPACK package, see http://www.netlib.org/scalapack/html/pblas_qref.html).
Sparse Solver Routines
Direct sparse solver routines in Intel MKL (see Sparse Solver Routines ) solve symmetric and symmetrically-structured sparse matrices with real or complex coefficients. For symmetric matrices, these Intel MKL subroutines can solve both positive-definite and indefinite systems. Intel MKL includes a solver based on the PARDISO* sparse solver, referred to as Intel MKL PARDISO, as well as an alternative set of user callable direct sparse solver routines.
If you use the Intel MKL PARDISO sparse solver, please cite:
O.Schenk and K.Gartner. Solving unsymmetric sparse systems of linear equations with PARDISO. J. of Future Generation Computer Systems, 20(3):475-487, 2004.
Intel MKL provides also an iterative sparse solver (see Sparse Solver Routines) that uses Sparse BLAS level 2 and 3 routines and works with different sparse data formats.
Extended Eigensolver Routines
TheExtended Eigensolver RCI Routines is a set of high-performance numerical routines for solving standard (Ax = λx) and generalized (Ax = λBx) eigenvalue problems, where A and B are symmetric or Hermitian. It yields all the eigenvalues and eigenvectors within a given search interval. It is based on the Feast algorithm, an innovative fast and stable numerical algorithm presented in [Polizzi09], which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms [Bai00]) or other Davidson-Jacobi techniques [Sleijpen96]. The Feast algorithm is inspired by the density-matrix representation and contour integration technique in quantum mechanics.
It is free from orthogonalization procedures. Its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The Feast algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures. This algorithm is expected to significantly augment numerical performance in large-scale modern applications.
Some of the characteristics of the Feast algorithm [Polizzi09] are:
Converges quickly in 2-3 iterations with very high accuracy
Naturally captures all eigenvalue multiplicities
No explicit orthogonalization procedure
Can reuse the basis of pre-computed subspace as suitable initial guess for performing outer-refinement iterations
This capability can also be used for solving a series of eigenvalue problems that are close one another.
The number of internal iterations is independent of the size of the system and the number of eigenpairs in the search interval
The inner linear systems can be solved either iteratively (even with modest relative residual error) or directly
The Vector Mathematics functions (see Vector Mathematical Functions) include a set of highly optimized implementations of certain computationally expensive core mathematical functions (power, trigonometric, exponential, hyperbolic, etc.) that operate on vectors of real and complex numbers.
Application programs that might significantly improve performance with VM include nonlinear programming software, integrals computation, and many others. VM provides interfaces both for Fortran and C languages.
Statistical FunctionsVector Statistics (VS) contains three sets of functions (see Statistical Functions) providing:
- Pseudorandom, quasi-random, and non-deterministic random number generator subroutines implementing basic continuous and discrete distributions. To provide best performance, the VS subroutines use calls to highly optimized Basic Random Number Generators (BRNGs) and a set of vector mathematical functions.
- A wide variety of convolution and correlation operations.
- Initial statistical analysis of raw single and double precision multi-dimensional datasets.
Fourier Transform Functions
The Intel® MKL multidimensional Fast Fourier Transform (FFT) functions with mixed radix support (see Fourier Transform Functions) provide uniformity of discrete Fourier transform computation and combine functionality with ease of use. Both Fortran and C interface specification are given. There is also a cluster version of FFT functions, which runs on distributed-memory architectures and is provided only for Intel® 64 and Intel® Many Integrated Core architectures.
The FFT functions provide fast computation via the FFT algorithms for arbitrary lengths. See the Intel® MKL Developer Guide for the specific radices supported.
Partial Differential Equations Support
Intel® MKL provides tools for solving Partial Differential Equations (PDE) (see Partial Differential Equations Support). These tools are Trigonometric Transform interface routines and Poisson Solver.
The Trigonometric Transform routines may be helpful to users who implement their own solvers similar to the Intel MKL Poisson Solver. The users can improve performance of their solvers by using fast sine, cosine, and staggered cosine transforms implemented in the Trigonometric Transform interface.
The Poisson Solver is designed for fast solving of simple Helmholtz, Poisson, and Laplace problems. The Trigonometric Transform interface, which underlies the solver, is based on the Intel MKL FFT interface (refer to Fourier Transform Functions), optimized for Intel® processors.
Nonlinear Optimization Problem Solvers
Intel® MKL provides Nonlinear Optimization Problem Solver routines (see Nonlinear Optimization Problem Solvers) that can be used to solve nonlinear least squares problems with or without linear (bound) constraints through the Trust-Region (TR) algorithms and compute Jacobi matrix by central differences.
The Intel® MKL support functions (see Support Functions) are used to support the operation of the Intel MKL software and provide basic information on the library and library operation, such as the current library version, timing, setting and measuring of CPU frequency, error handling, and memory allocation.
Starting from release 10.0, the Intel MKL support functions provide additional threading control.
Starting from release 10.1, Intel MKL selectively supports a Progress Routine feature to track progress of a lengthy computation and/or interrupt the computation using a callback function mechanism. The user application can define a function called mkl_progress that is regularly called from the Intel MKL routine supporting the progress routine feature. See Progress Routine in Support Functions for reference. Refer to a specific LAPACK or DSS/PARDISO function description to see whether the function supports this feature or not.
The Intel® Math Kernel Library implements routines from the BLACS (Basic Linear Algebra Communication Subprograms) package (see BLACS Routines) that are used to support a linear algebra oriented message passing interface that may be implemented efficiently and uniformly across a large range of distributed memory platforms.
The original versions of BLACS from which that part of Intel MKL was derived can be obtained from http://www.netlib.org/blacs/index.html. The authors of BLACS are Jack Dongarra and R. Clint Whaley.
Data Fitting Functions
The Data Fitting component includes a set of highly-optimized implementations of algorithms for the following spline-based computations:
- spline construction
- interpolation including computation of derivatives and integration
The algorithms operate on single and double vector-valued functions set in the points of the given partition. You can use Data Fitting algorithms in applications that are based on data approximation. See Data Fitting Functions for more information.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804