This section describes the interface to the sharedmemory multiprocessing parallel direct sparse solver known as the Intel MKL PARDISO solver. The interface is Fortran, but it can be called from C or C++ programs by observing Fortran parameter passing and naming conventions used by the supported compilers and operating systems.
The Intel MKL PARDISO package is a highperformance, robust, memory efficient, and easy to use software package for solving large sparse linear systems of equations on shared memory multiprocessors. The solver uses a combination of left and rightlooking Level3 BLAS supernode techniques [Schenk002]. To improve sequential and parallel sparse numerical factorization performance, the algorithms are based on a Level3 BLAS update and pipelining parallelism is used with a combination of left and rightlooking supernode techniques [Schenk00, Schenk01, Schenk02, Schenk03]. The parallel pivoting methods allow complete supernode pivoting to compromise numerical stability and scalability during the factorization process. For sufficiently large problem sizes, numerical experiments demonstrate that the scalability of the parallel algorithm is nearly independent of the sharedmemory multiprocessing architecture.
The following table lists the names of the Intel MKL PARDISO routines and describes their general use.
Routine  Description 

pardisoinit 
Initializes Intel MKL PARDISO with default parameters depending on the matrix type. 
pardiso 
Calculates the solution of a set of sparse linear equations with single or multiple righthand sides. 
pardiso_64 
Calculates the solution of a set of sparse linear equations with single or multiple righthand sides, 64bit integer version. 
pardiso_getenv 
Retrieves additional values from the Intel MKL PARDISO handle. 
pardiso_setenv 
Sets additional values in the Intel MKL PARDISO handle. 
mkl_pardiso_pivot 
Replaces routine which handles Intel MKL PARDISO pivots with userdefined routine. 
pardiso_getdiag 
Returns diagonal elements of initial and factorized matrix. 
pardiso_handle_store 
Store internal structures from pardiso to a file. 
pardiso_handle_restore 
Restore pardiso internal structures from a file. 
pardiso_handle_delete 
Delete files with pardiso internal structure data. 
pardiso_handle_store_64 
Store internal structures from pardiso_64 to a file. 
pardiso_handle_restore_64 
Restore pardiso_64 internal structures from a file. 
pardiso_handle_delete_64 
Delete files with pardiso_64 internal structure data. 
The Intel MKL PARDISO solver supports a wide range of real and complex sparse matrix types (see the figure below).
The Intel MKL PARDISO solver performs four tasks:
 analysis and symbolic factorization
 numerical factorization
 forward and backward substitution including iterative refinement
 termination to release all internal solver memory.
You can find code examples that use Intel MKL PARDISO routines to solve systems of linear equations in the examples folder of the Intel MKL installation directory:

Fortran examples: examples/solverf/source

C examples: examples/solverc/source
Supported Matrix Types
The analysis steps performed by Intel MKL PARDISO depend on the structure of the input matrix A.
 Symmetric Matrices

The solver first computes a symmetric fillin reducing permutation P based on either the minimum degree algorithm [Liu85] or the nested dissection algorithm from the METIS package [Karypis98] (both included with Intel MKL), followed by the parallel leftright looking numerical Cholesky factorization [Schenk002] of
PAP^{T} = LL^{T}
for symmetric positivedefinite matrices, or PAP^{T} = LDL^{T} for symmetric indefinite matrices. The solver uses diagonal pivoting, or 1x1 and 2x2 BunchKaufman pivoting for symmetric indefinite matrices. An approximation of X is found by forward and backward substitution and optional iterative refinement.Whenever numerically acceptable 1x1 and 2x2 pivots cannot be found within the diagonal supernode block, the coefficient matrix is perturbed. One or two passes of iterative refinement may be required to correct the effect of the perturbations. This restricting notion of pivoting with iterative refinement is effective for highly indefinite symmetric systems. Furthermore, for a large set of matrices from different applications areas, this method is as accurate as a direct factorization method that uses complete sparse pivoting techniques [Schenk04].
Another method of improving the pivoting accuracy is to use symmetric weighted matching algorithms. These algorithms identify large entries in the coefficient matrix A that, if permuted close to the diagonal, permit the factorization process to identify more acceptable pivots and proceed with fewer pivot perturbations. These algorithms are based on maximum weighted matchings and improve the quality of the factor in a complementary way to the alternative of using more complete pivoting techniques.
The inertia is also computed for real symmetric indefinite matrices.
 Structurally Symmetric Matrices

The solver first computes a symmetric fillin reducing permutation P followed by the parallel numerical factorization of PAP^{T} = QLU^{T}. The solver uses partial pivoting in the supernodes and an approximation of X is found by forward and backward substitution and optional iterative refinement.
 Nonsymmetric Matrices

The solver first computes a nonsymmetric permutation P_{MPS} and scaling matrices D_{r} and D_{c} with the aim of placing large entries on the diagonal to enhance reliability of the numerical factorization process [Duff99]. In the next step the solver computes a fillin reducing permutation P based on the matrix P_{MPS}A + (P_{MPS}A)^{T} followed by the parallel numerical factorization
QLUR = PP_{MPS}D_{r}AD_{c}P
with supernode pivoting matrices Q and R. When the factorization algorithm reaches a point where it cannot factor the supernodes with this pivoting strategy, it uses a pivoting perturbation strategy similar to [Li99]. The magnitude of the potential pivot is tested against a constant threshold of
alpha = eps*A2_{inf}
,where
eps
is the machine precision,A2 = P*P_{MPS}*D_{r}*A*D_{c}*P
, andA2_{inf}
is the infinity norm ofA
. Any tiny pivots encountered during elimination are set to the sign(l_{II})*eps*A2_{inf}
, which trades off some numerical stability for the ability to keep pivots from getting too small. Although many failures could render the factorization welldefined but essentially useless, in practice the diagonal elements are rarely modified for a large class of matrices. The result of this pivoting approach is that the factorization is, in general, not exact and iterative refinement may be needed.
Sparse Data Storage
Sparse data storage in Intel MKL PARDISO follows the 3array variation of the compressed sparse row (CSR3) format described in Sparse Matrix Storage Format with the Intel MKL PARDISO parameters ja standing for columns, ia for rowIndex, and a for values. The algorithms in Intel MKL PARDISO require column indices ja to be in increasing order per row and that the diagonal element in each row be present for any structurally symmetric matrix. For symmetric or nonsymmetric matrices the diagonal elements which are equal to zero are not necessary.
Caution
Intel MKL PARDISO column indices ja must be in increasing order per row. You can validate the sparse matrix structure with the matrix checker (iparm(27)
)
Note
While the presence of zero diagonal elements for symmetric matrices is not required, you should explicitly set zero diagonal elements for symmetric matrices. Otherwise, Intel MKL PARDISO creates internal copies of arrays ia, ja, and a full of diagonal elements, which require additional memory and computational time. However, the memory and time required the diagonal elements in internal arrays is usually not significant compared to the memory and the time required to factor and solve the matrix.
Storage of Matrices
By default, Intel MKL PARDISO stores matrices in RAM. However, you can specify that Intel MKL PARDISO store matrices on disk by setting iparm(60)
. This is referred to as incore (IC) and outofcore (OOC), respectively.
OOC parameters can be set in a configuration file. You can set the path to this file and its name using environmental variables MKL_PARDISO_OOC_CFG_PATH and MKL_PARDISO_OOC_CFG_FILE_NAME.
These variables specify the path and filename as follows:
<MKL_PARDISO_OOC_CFG_PATH>/<MKL_PARDISO_OOC_CFG_FILE_NAME> for Linux* OS and OS X*, and
<MKL_PARDISO_OOC_CFG_PATH>\<MKL_PARDISO_OOC_CFG_FILE_NAME> for Windows* OS.
By default, the name of the file is pardiso_ooc.cfg and it is placed in the current directory.
All temporary data files can be deleted or stored when the calculations are completed in accordance with the value of the environmental variable MKL_PARDISO_OOC_KEEP_FILE. If it is not set or if it is set to 1, then all files are deleted. If it is set to 0, then all files are stored.
By default, the OOC version of Intel MKL PARDISO uses the current directory for storing data, and all work arrays associated with the matrix factors are stored in files named ooc_temp with different extensions. These default values can be changed by using the environmental variable MKL_PARDISO_OOC_PATH.
To set the environmental variables MKL_PARDISO_OOC_MAX_CORE_SIZE, MKL_PARDISO_OOC_MAX_SWAP_SIZE, MKL_PARDISO_OOC_KEEP_FILE, and MKL_PARDISO_OOC_PATH, create the configuration file with the following lines:
MKL_PARDISO_OOC_PATH = <path>\ooc_file
MKL_PARDISO_OOC_MAX_CORE_SIZE = N
MKL_PARDISO_OOC_MAX_SWAP_SIZE = K
MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
where <path>
is the directory for storing data, ooc_file
is the file name without any extension, N is the maximum size of RAM in megabytes available for Intel MKL PARDISO (default value is 2000 MB), K is the maximum swap size in megabytes available for Intel MKL PARDISO (default value is 0 MB). Do not set N+K greater than the size of the RAM plus the size of the swap. Be sure to allow enough free memory for the operating system and any other processes which are necessary.
Caution
The maximum length of the path lines in the configuration files is 1000 characters.
Alternatively the environment variables can be set via command line.
For Linux* OS and OS X*:
export MKL_PARDISO_OOC_PATH = <path>/ooc_file
export MKL_PARDISO_OOC_MAX_CORE_SIZE = N
export MKL_PARDISO_OOC_MAX_SWAP_SIZE = K
export MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
For Windows* OS:
set MKL_PARDISO_OOC_PATH = <path>\ooc_file
set MKL_PARDISO_OOC_MAX_CORE_SIZE = N
set MKL_PARDISO_OOC_MAX_SWAP_SIZE = K
set MKL_PARDISO_OOC_KEEP_FILE = 0 (or 1)
Note
The values specified in a command line have higher priorities: if a variable is changed in the configuration file and in the command line, OOC version of Intel MKL PARDISO uses only the value defined in the command line.
DirectIterative Preconditioning for Nonsymmetric Linear Systems
The solver uses a combination of direct and iterative methods [Sonn89] to accelerate the linear solution process for transient simulation. Most applications of sparse solvers require solutions of systems with gradually changing values of the nonzero coefficient matrix, but with an identical sparsity pattern. In these applications, the analysis phase of the solvers has to be performed only once and the numerical factorizations are the important timeconsuming steps during the simulation. Intel MKL PARDISO uses a numerical factorization and applies the factors in a preconditioned KrylowSubspace iteration. If the iteration does not converge, the solver automatically switches back to the numerical factorization. This method can be applied to nonsymmetric matrices in Intel MKL PARDISO. You can select the method using the iparm(4) input parameter. The iparm(20) parameter returns the error status after running Intel MKL PARDISO.
Single and Double Precision Computations
Intel MKL PARDISO solves tasks using single or double precision. Each precision has its benefits and drawbacks. Double precision variables have more digits to store value, so the solver uses more memory for keeping data. But this mode solves matrices with better accuracy, which is especially important for input matrices with large condition numbers.
Single precision variables have fewer digits to store values, so the solver uses less memory than in the double precision mode. Additionally this mode usually takes less time. But as computations are made less precisely, only some systems of equations can be solved accurately enough using single precision.
Separate Forward and Backward Substitution
The solver execution step (see parameter phase = 33
below) can be divided into two or three separate substitutions: forward, backward, and possible diagonal. This separation can be explained by the examples of solving systems with different matrix types.
A real symmetric positive definite matrix A (mtype = 2
) is factored by Intel MKL PARDISO as A = L*L^{T}
. In this case the solution of the system A*x=b
can be found as sequence of substitutions: L*y=b
(forward substitution, phase =331
) and L^{T}*x=y
(backward substitution, phase =333
).
A real nonsymmetric matrix A (mtype = 11
) is factored by Intel MKL PARDISO as A = L*U
. In this case the solution of the system A*x=b
can be found by the following sequence: L*y=b
(forward substitution, phase =331
) and U*x=y
(backward substitution, phase =333
).
Solving a system with a real symmetric indefinite matrix A (mtype = 2
) is slightly different from the cases above. Intel MKL PARDISO factors this matrix as A=LDL^{T}
, and the solution of the system A*x=b
can be calculated as the following sequence of substitutions: L*y=b
(forward substitution, phase =331
), D*v=y
(diagonal substitution, phase =332
), and finally L^{T}*x=v
(backward substitution, phase =333
). Diagonal substitution makes sense only for symmetric indefinite matrices (mtype = 2, 4, 6
). For matrices of other types a solution can be found as described in the first two examples.
Caution
The number of refinement steps (iparm(8)
) must be set to zero if a solution is calculated with separate substitutions (phase = 331, 332, 333
), otherwise Intel MKL PARDISO produces the wrong result.
Note
Different pivoting (iparm(21)
) produces different LDL^{T}
factorization. Therefore results of forward, diagonal and backward substitutions with diagonal pivoting can differ from results of the same steps with BunchKaufman pivoting. Of course, the final results of sequential execution of forward, diagonal and backward substitution are equal to the results of the full solving step (phase=33
) regardless of the pivoting used.
Callback Function for Pivoting Control
Incore Intel MKL PARDISO allows you to control pivoting with a callback routine, mkl_pardiso_pivot. You can then use the pardiso_getdiag routine to access the diagonal elements. Set iparm(56)
to 1 in order to use the callback functionality.
 pardiso
 pardisoinit
 pardiso_64
 pardiso_getenv, pardiso_setenv
 mkl_pardiso_pivot
 pardiso_getdiag
 pardiso_handle_store
 pardiso_handle_restore
 pardiso_handle_delete
 pardiso_handle_store_64
 pardiso_handle_restore_64
 pardiso_handle_delete_64
 Intel MKL PARDISO Parameters in Tabular Form
 pardiso iparm Parameter
 PARDISO_DATA_TYPE