Introduction to the Intel MKL Extended Eigensolver
By Zhang, Zhang (Intel), published on February 12, 2013
Intel® MKL 11.0 Update 2 introduced a new component called Extended Eigensolver routines. These routines solve standard and generalized Eigenvalue problems for symmetric/Hermitian and symmetric/Hermitian positive definite sparse matrices. Specifically, these routines computes all the Eigenvalues and the corresponding Eigenvectors within a given search interval [λ_{min}, λ_{max}]:
- Ax = λx, and
- Ax = λBx
where A is a real symmetric or complex Hermitian matrix and B is a real symmetric positive definite or complex Hermitian positive definite matrix. Eigenvalue problems for dense and band matrices are also supported, although solvers for these matrices have long existed in LAPACK. The Extended Eigensolver is intended to complement the Eigenvalue solvers in LAPACK, which lacks support for sparse matrices.
The Extended Eigensolver is an implementation of the FEAST Eigenvalue Solver. The interfaces are compatible with FEAST Eigenvalue Solver v2.0. The FEAST algorithm is fundamentally different than the traditional Krylov subspace iteration based algorithms. It uses a numerically efficient contour integration technique to obtain Eigenpairs within the search interval. It has advantages such as the ability to capture all multiplicities, fast convergence, being insensitive to spectrum structures, and so on.
Main features
- Highly tuned predefined driver routines provide simple usage model for sparse (CSR format) matrices, as well as band (LAPACK band storage format) and dense matrices.
- RCI (Reverse Communication Interface) routines provide expert usage model that supports arbitrary matrix data formats and user-provided matrix operation and linear system solver routines.
- FORTRAN77, FORTRAN90, and C interfaces.
- Real and complex, single precision and double precision data types are all supported.
Usage models
Most users should use the predefined driver routines because they are easy to use and they are optimized by using the high performance Intel MKL BLAS, LAPACK, and linear solvers internally. The steps of using the predefined driver routines are:
- Call feastinit(fpm) to initialize the array of input parameters.
- Modify the values for the input parameters, if necessary.
- Call one of the predefined driver routines to solve the Eigenvalue problem.
Parameters that affect the behavior of the solver are stored in fpm, which is an array of integers (size = 128). Function feastinit populates this array with default values of all parameters. The default values are usually fine with most problems. A few parameters that a user may want to override are summarized below:
Parameters | Default Values | Descriptions |
fpm(2) | 8 | The number of contour points. It must be one of {3,4,5,6,8,10,12,16,20,24,32,40,48}. |
fpm(3) | 12 | Error trace double precision stopping criteria, 10^{-fpm(3)}. |
fpm(4) | 20 | Maximum number of refinement loops allowed. If no convergence is reached after this much iterations, the solver routine returns an error. |
fpm(5) | 0 | Whether the initial subspace is supplied by user or generated by the solver. The default is to be generated by the solver. |
fpm(7) | 5 | Error trace single precision stopping criteria, 10^{-fpm(7)}. |
The predefined driver rountes are:
(The ? symbol is one of s, d, c, z for single precision real, double precision real, single precision complex, and double precision complex data types, respectively)
Matrix Type | Standard Eigenvalue Problems | Generalized Eigenvalue Problems |
Sparse | ?feast_scsrev ?feast_hcsrev | ?feast_scsrgv ?feast_hcsrgv |
Band | ?feast_sbev ?feast_hbev | ?feast_sbgv ?feast_hbgv |
Dense | ?feast_syev ?feast_heev | ?feast_sygv ?feast_hegv |
As an example,here is the C syntax for sfeast_scsrev, which solves a single precision Eigenvalue problem for a sparse symmetric matrix:
void sfeast_scsrev( const char* uplo, /* Upper or lower triangular part of the matrix */ const int* n, /* Size of the problem */ const float* a, /* a, ia, and ja represent the sparse matrix stored in the CSR format */ const int* ia, const int* ja, int* fpm, /* Parameter array initialized by feastinit */ float* epsout, /* On output contains the relative error on the trace */ int* loop, /* On output contains the number of refinement loops needed to converge */ const float* emin, /* The lower bound of the interval to be searched for Eigenvalues */ const float* emax, /* The upper bound of the interval to be searched for Eigenvalues */ int* m0, /* The initial guess of subspace dimension to be used */ float* e, /* On output, this array contains the Eigenvalues found in the interval */ float* x, /* On output, contains all Eigenvectors corresponding to e */ int* m, /* On output, the total numbers of Eigenvalues found in the interval */ float* res, /* On output, the relative residual vector of length m */ int* info /* On output, contains error code or 0 if the execution is successful */ );
Refer to the "Extended Eigensolver Routines" chapter in the Intel MKL reference manual for the syntax of other predefined driver routines.
RCI routines provide a second usage model, which supports arbitrary matrix data formats and user-supplied matrix operations and linear system solvers. Using RCI routines is more involved than using the predefined driver routines. Roughly speaking, it takes the following steps:
- Call feastinit(fpm) to initialize the array of input parameters.
- Modify values for the input parameters, if necessary.
- Call an RCI routine. Depending on the output, user needs to perform one of the following operations with user-provided routines:
Factorize the matrix at a certain contour point z_{e} : (z_{e}*B - A).
Solve a linear system with multiple right hand sides: (z_{e}*B - A)x = Y, or (z_{e}*B - A)^{H}x = Y.
Matrix multiplication: BY = X, or AX = Y - Repeat step 3 until the output of the RCI routine indicates successful completion of the solving process.
Refer to the "Extended Eigensolver Routines" chapter in the Intel MKL reference manual for the detail discussion about RCI routines and code samples.
Performance tips
- The implementation of the Extended Eigensolver is not explicitly threaded. Parallelism depends on the underlying BLAS and linear system solver routines. The predefined driver routines internally use the highly parallel and high performance BLAS, LAPACK, and PARDISO components provided by Intel MKL. For the RCI routines, the performance of user-provided matrix multiplication, factorization, and linear solver routines is critical.
- The performance of the Extended Eigensolver is sensitive to the initial guess of subspace size supplied by the user (m0) and the number of contour points, fpm(2). m0 should be much smaller than the given problem size, and the number of contour points should be small. On the other hand, there is a trade-off between performance and accuracy. As a rule of thumb, when seeking no more than 1000 Eigenvalues, m0 = 1.5 x m, where m is the number of Eigenvalues; and the number of contour points should be 8 or 16.
- In case the search interval is big and the number of Eigenvalues is expected to be large (> 1000), then it makes sense to split the search interval to multiple smaller independent intervals and solve these multiple Eigenvalue problems in parallel, e.g. using multiple user threads.
References
- Intel Math Kernel Library Reference Manual. The chapter on Extended Eigensolver Routines.
- The FEAST Eigenvalue Solver website (http://www.ecs.umass.edu/~polizzi/feast/)
- Normal 0 false false false EN-US ZH-CN X-NONE MicrosoftInternetExplorer4 Ping Tak Peter and Eric Polizzi, Subspace Iteration with Approximate Spectral Projection, SIAM Journal of Matrix Analysis and Applications, submitted. (http://arxiv.org/abs/1302.0432)
- FORTRAN code samples: <MKL install location>/examples/solvers_eef
- C code samples: <MKL install location>/examples/solvers_eec