Developer Reference

Overview

This publication, the Intel® Math Kernel Library Developer Reference, was previously known as the Intel® Math Kernel Library Reference Manual.
Intel® Math Kernel Library (Intel® MKL) is optimized for performance on Intel processors. Intel® MKL also runs on non-Intel x86-compatible processors.
Intel® MKL provides limited input validation to minimize performance overhead. When using Intel® MKL, it is your responsibility to ensure that input data has the required format and does not contain invalid characters, because such input can cause unexpected behavior of the library. Examples of inputs that may result in unexpected behavior:
  • Not-a-number (NaN) and other special floating-point values
  • Large inputs that may lead to accumulator overflow
As the Intel® MKL API accepts raw pointers, it is your application's responsibility to validate the buffer sizes before passing them to the library. The library requires subroutine and function parameters to be valid before they are passed. While some Intel® MKL routines do limited checking of parameter errors, your application should perform its own checks, for example for NULL pointers.
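The checks described above can be sketched in plain C. This is a minimal illustration, not part of Intel® MKL: the helper name `input_is_valid` and its `capacity` parameter are hypothetical, and the sketch covers only NULL pointers, buffer size, and non-finite values.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not an Intel MKL function): returns 1 if the first n
 * elements of x can safely be handed to a numerical routine, 0 otherwise.
 * capacity is the number of elements actually allocated for x. */
static int input_is_valid(const double *x, size_t n, size_t capacity)
{
    if (x == NULL) {
        return 0;               /* the library would dereference this pointer */
    }
    if (n > capacity) {
        return 0;               /* the routine would read past the buffer */
    }
    for (size_t i = 0; i < n; ++i) {
        if (!isfinite(x[i])) {  /* rejects NaN and +/-infinity */
            return 0;
        }
    }
    return 1;
}
```

An application would call such a helper once per input buffer before invoking a library routine, so the cost of validation stays outside the optimized kernels.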
The Intel® Math Kernel Library includes Fortran routines and functions optimized for Intel® processor-based computers running operating systems that support multiprocessing. In addition to the Fortran interface, Intel® MKL includes a C-language interface for the Discrete Fourier transform functions, as well as for the Vector Mathematics and Vector Statistics functions. For hardware and software requirements to use Intel® MKL, see Intel® MKL Release Notes.
Function calls at runtime for Intel® MKL libraries on the Microsoft Windows* operating system can utilize the function LoadLibrary() and related loading functions in static, dynamic, and single dynamic library linking models. These functions attempt to acquire the loader lock, which can lead to a deadlock when this happens within, or at the same time as, another DllMain function call. If possible, avoid making your calls to Intel® MKL in a DllMain function or at the same time as other calls to DllMain, even on separate threads. Refer to Microsoft documentation about DllMain and Dynamic-Link Library Best Practices for more details.

BLAS Routines

The BLAS routines and functions are divided into the following groups according to the operations they perform:
  • BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical operations include scaling and dot products.
  • BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication, rank-1 and rank-2 matrix updates, and solution of triangular systems.
  • BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication, rank-k update, and solution of triangular systems.
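To make the three levels concrete, the following sketch shows one representative operation per level as a plain C loop. These are unoptimized reference loops written for illustration only; the function names (`ref_dot`, `ref_gemv`, `ref_gemm`) and the row-major layout are assumptions of this sketch, not the Intel® MKL interface.

```c
#include <stddef.h>

/* Level 1 (vector-vector): dot product, the sum of x[i] * y[i]. */
static double ref_dot(size_t n, const double *x, const double *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i] * y[i];
    }
    return acc;
}

/* Level 2 (matrix-vector): y = A * x, where A is m-by-n, stored row-major. */
static void ref_gemv(size_t m, size_t n, const double *a,
                     const double *x, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        y[i] = ref_dot(n, &a[i * n], x);   /* row i of A times x */
    }
}

/* Level 3 (matrix-matrix): C = A * B, where A is m-by-k, B is k-by-n,
 * and C is m-by-n, all stored row-major. */
static void ref_gemm(size_t m, size_t n, size_t k, const double *a,
                     const double *b, double *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}
```

The amount of arithmetic per data element grows with the level: Level 1 does O(n) work on O(n) data, Level 2 does O(n²) work on O(n²) data, and Level 3 does O(n³) work on O(n²) data, which is why Level 3 routines typically benefit most from optimization.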
Starting from release 8.0, Intel® MKL also supports the Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are available that let you perform certain data manipulations, including in-place and out-of-place matrix transposition combined with simple matrix arithmetic operations.

Sparse BLAS Routines

The Sparse BLAS Level 1 routines and functions and the Sparse BLAS Level 2 and Level 3 routines operate on sparse vectors and matrices. These routines perform operations similar to the BLAS Level 1, 2, and 3 routines. The Sparse BLAS routines take advantage of vector and matrix sparsity: they allow you to store only the non-zero elements of vectors and matrices.
Intel® MKL also supports the Fortran 95 interface to the Sparse BLAS routines.
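As an illustration of storing only non-zero elements, the sketch below keeps a sparse vector as a pair of arrays (values and their positions) and computes its dot product with a dense vector. The function name `sparse_dot`, the zero-based indexing, and the argument order are assumptions of this sketch, not the Intel® MKL Sparse BLAS interface.

```c
#include <stddef.h>

/* Dot product of a compressed sparse vector with a dense vector y.
 * nz   : number of stored non-zero elements
 * x    : the nz non-zero values
 * indx : indx[i] is the zero-based position of x[i] in the full vector
 * y    : dense vector, long enough to be indexed by every indx[i] */
static double sparse_dot(size_t nz, const double *x, const size_t *indx,
                         const double *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < nz; ++i) {
        acc += x[i] * y[indx[i]];   /* zeros are never stored or visited */
    }
    return acc;
}
```

The loop runs over the stored elements only, so both memory use and work are proportional to the number of non-zeros rather than to the full vector length.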

Sparse QR

Sparse QR in Intel® MKL is a set of routines used to solve sparse matrices with real coefficients and general structure. All Sparse QR routines can be divided into three steps: reordering, factorization, and solving. Currently, only the compressed sparse row (CSR) format is supported for the input matrix, and Sparse QR operates on the matrix handle used in all SpBLAS IE routines. (For details on how to create a matrix handle, refer to mkl-sparse-create-csr.)
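For readers unfamiliar with CSR, the following sketch shows the three arrays that make up the format and a matrix-vector product over them. The struct `csr_matrix` and the function `csr_matvec` are hypothetical names for illustration, with zero-based indexing assumed; this is not the matrix handle used by Intel® MKL.

```c
#include <stddef.h>

/* Hypothetical CSR container (zero-based indexing assumed).
 * Row i owns the entries row_ptr[i] .. row_ptr[i + 1] - 1 of
 * col_idx and values. */
typedef struct {
    size_t rows;
    const size_t *row_ptr;   /* rows + 1 entries */
    const size_t *col_idx;   /* one column index per non-zero */
    const double *values;    /* one value per non-zero */
} csr_matrix;

/* y = A * x for a CSR matrix A. */
static void csr_matvec(const csr_matrix *a, const double *x, double *y)
{
    for (size_t i = 0; i < a->rows; ++i) {
        double acc = 0.0;
        for (size_t p = a->row_ptr[i]; p < a->row_ptr[i + 1]; ++p) {
            acc += a->values[p] * x[a->col_idx[p]];
        }
        y[i] = acc;
    }
}
```

For example, the 2-by-3 matrix with rows (1, 0, 2) and (0, 3, 0) is stored as values {1, 2, 3}, col_idx {0, 2, 1}, and row_ptr {0, 2, 3}.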

LAPACK Routines

The Intel® Math Kernel Library fully supports the LAPACK 3.7 set of computational, driver, auxiliary, and utility routines.
The original versions of LAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they perform:
Starting from release 8.0, Intel® MKL also supports the Fortran 95 interface to LAPACK computational and driver routines. This interface provides an opportunity for simplified calls of LAPACK routines with fewer required arguments.

Sparse Solver Routines

Direct sparse solver routines in Intel® MKL (see Sparse Solver Routines) solve symmetric and symmetrically-structured sparse matrices with real or complex coefficients. For symmetric matrices, these Intel® MKL subroutines can solve both positive-definite and indefinite systems. Intel® MKL includes a solver based on the PARDISO* sparse solver, referred to as Intel® MKL PARDISO, as well as an alternative set of user-callable direct sparse solver routines.
If you use the Intel® MKL PARDISO sparse solver, please cite:
O. Schenk and K. Gartner. Solving unsymmetric sparse systems of linear equations with PARDISO. J. of Future Generation Computer Systems, 20(3):475-487, 2004.
Intel® MKL also provides an iterative sparse solver (see Sparse Solver Routines) that uses Sparse BLAS Level 2 and 3 routines and works with different sparse data formats.

Extended Eigensolver Routines

The Extended Eigensolver RCI Routines are a set of high-performance numerical routines for solving standard (Ax = λx) and generalized (Ax = λBx) eigenvalue problems, where A and B are symmetric or Hermitian. These routines yield all the eigenvalues and eigenvectors within a given search interval. They are based on the Feast algorithm, an innovative fast and stable numerical algorithm presented in [Polizzi09], which deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms [Bai00]) or other Davidson-Jacobi techniques [Sleijpen96]. The Feast algorithm is inspired by the density-matrix representation and contour integration technique in quantum mechanics.
It is free from orthogonalization procedures. Its main computational tasks consist of solving very few inner independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of magnitude smaller than the original one. The Feast algorithm combines simplicity and efficiency and offers many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel architectures. This algorithm is expected to significantly augment numerical performance in large-scale modern applications.
Some of the characteristics of the Feast algorithm [Polizzi09] are:
  • Converges quickly in 2-3 iterations with very high accuracy
  • Naturally captures all eigenvalue multiplicities
  • No explicit orthogonalization procedure
  • Can reuse the basis of a pre-computed subspace as a suitable initial guess for performing outer-refinement iterations
    This capability can also be used for solving a series of eigenvalue problems that are close to one another.
  • The number of internal iterations is independent of the size of the system and the number of eigenpairs in the search interval
  • The inner linear systems can be solved either iteratively (even with modest relative residual error) or directly

VM Functions

The Vector Mathematics functions (see Vector Mathematical Functions) include a set of highly optimized implementations of certain computationally expensive core mathematical functions (power, trigonometric, exponential, hyperbolic, etc.) that operate on vectors of real and complex numbers.
Application programs that might significantly improve performance with VM include nonlinear programming software, integrals computation, and many others. VM provides interfaces both for Fortran and C languages.

Statistical Functions

Vector Statistics (VS) contains three sets of functions (see Statistical Functions) providing:
  • Pseudorandom, quasi-random, and non-deterministic random number generator subroutines implementing basic continuous and discrete distributions. To provide best performance, the VS subroutines use calls to highly optimized Basic Random Number Generators (BRNGs) and a set of vector mathematical functions.
  • A wide variety of convolution and correlation operations.
  • Initial statistical analysis of raw single and double precision multi-dimensional datasets.

Fourier Transform Functions

The Intel® MKL multidimensional Fast Fourier Transform (FFT) functions with mixed radix support (see Fourier Transform Functions) provide uniformity of discrete Fourier transform computation and combine functionality with ease of use. Both Fortran and C interface specifications are given. There is also a cluster version of the FFT functions, which runs on distributed-memory architectures and is provided only for Intel® 64 and Intel® Many Integrated Core architectures.
The FFT functions provide fast computation via the FFT algorithms for arbitrary lengths. See the Intel® MKL Developer Guide for the specific radices supported.

Partial Differential Equations Support

Intel® MKL provides tools for solving Partial Differential Equations (PDE) (see Partial Differential Equations Support). These tools are Trigonometric Transform interface routines and Poisson Solver.
The Trigonometric Transform routines may be helpful to users who implement their own solvers similar to the Intel® MKL Poisson Solver. The users can improve performance of their solvers by using fast sine, cosine, and staggered cosine transforms implemented in the Trigonometric Transform interface.
The Poisson Solver is designed for fast solving of simple Helmholtz, Poisson, and Laplace problems. The Trigonometric Transform interface, which underlies the solver, is based on the Intel® MKL FFT interface (refer to Fourier Transform Functions), optimized for Intel® processors.

Support Functions

The Intel® MKL support functions (see Support Functions) are used to support the operation of the Intel® MKL software and provide basic information on the library and library operation, such as the current library version, timing, setting and measuring of CPU frequency, error handling, and memory allocation.
Starting from release 10.0, the Intel® MKL support functions provide additional threading control.
Starting from release 10.1, Intel® MKL selectively supports a Progress Routine feature to track progress of a lengthy computation and/or interrupt the computation using a callback function mechanism. The user application can define a function called mkl_progress that is regularly called from the Intel® MKL routine supporting the progress routine feature. See Progress Routine in Support Functions for reference. Refer to a specific LAPACK or DSS/PARDISO function description to see whether the function supports this feature or not.
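A user-defined progress routine might look like the sketch below. It assumes the C signature `int mkl_progress(int *thread, int *step, char *stage, int lstage)` and the convention that a non-zero return value asks the library to stop; both should be checked against the Progress Routine reference for your release. The `stop_requested` flag is a hypothetical application variable.

```c
#include <stdio.h>

/* Hypothetical application flag, set elsewhere (for example by a UI action)
 * to request that the running computation be interrupted. */
static volatile int stop_requested = 0;

/* Sketch of a user-defined progress callback (signature assumed, see above).
 * thread : number of the calling thread
 * step   : current step of the computation
 * stage  : name of the current stage, lstage characters long */
int mkl_progress(int *thread, int *step, char *stage, int lstage)
{
    /* Print exactly lstage characters so the code does not rely on a
     * terminating null character in stage. */
    printf("thread %d, stage %.*s, step %d\n", *thread, lstage, stage, *step);

    /* Non-zero asks the calling routine to stop; zero lets it continue. */
    return stop_requested;
}
```

Because the callback is invoked regularly from inside the computation, it should stay cheap: reporting progress and reading a flag, as here, rather than doing substantial work.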
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE4.2, AVX2, AVX-512.
