p?heevx
Computes selected eigenvalues and, optionally, eigenvectors of a Hermitian matrix.
Syntax
void pcheevx (char *jobz, char *range, char *uplo, MKL_INT *n, MKL_Complex8 *a, MKL_INT *ia, MKL_INT *ja, MKL_INT *desca, float *vl, float *vu, MKL_INT *il, MKL_INT *iu, float *abstol, MKL_INT *m, MKL_INT *nz, float *w, float *orfac, MKL_Complex8 *z, MKL_INT *iz, MKL_INT *jz, MKL_INT *descz, MKL_Complex8 *work, MKL_INT *lwork, float *rwork, MKL_INT *lrwork, MKL_INT *iwork, MKL_INT *liwork, MKL_INT *ifail, MKL_INT *iclustr, float *gap, MKL_INT *info);
void pzheevx (char *jobz, char *range, char *uplo, MKL_INT *n, MKL_Complex16 *a, MKL_INT *ia, MKL_INT *ja, MKL_INT *desca, double *vl, double *vu, MKL_INT *il, MKL_INT *iu, double *abstol, MKL_INT *m, MKL_INT *nz, double *w, double *orfac, MKL_Complex16 *z, MKL_INT *iz, MKL_INT *jz, MKL_INT *descz, MKL_Complex16 *work, MKL_INT *lwork, double *rwork, MKL_INT *lrwork, MKL_INT *iwork, MKL_INT *liwork, MKL_INT *ifail, MKL_INT *iclustr, double *gap, MKL_INT *info);
Include Files
- mkl_scalapack.h
Description
The p?heevx function computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A by calling the recommended sequence of ScaLAPACK functions. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for the desired eigenvalues.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE3, and SSSE3.
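As a rough illustration of the two selection modes mentioned in the Description (a range of values or a range of indices), the sketch below counts which entries of an ascending list of eigenvalues each mode would pick. The helper name count_selected is hypothetical and is not part of the library. The sketch assumes the eigenvalues are already known and sorted, and it treats both endpoints of the value interval as included, which is an assumption about endpoint handling rather than a statement about p?heevx itself.

```c
#include <assert.h>

/* Hypothetical helper, not an MKL routine: given eigenvalues w[0..n-1] in
   ascending order, return how many are selected by the given range mode and
   store the 0-based index of the first selected one in *first (-1 if none).
   il and iu are 1-based, as in the parameter list of p?heevx. */
static int count_selected(char range, const double *w, int n,
                          double vl, double vu, int il, int iu, int *first)
{
    int m = 0;
    *first = -1;

    if (range == 'A') {              /* all eigenvalues */
        if (n > 0) *first = 0;
        return n;
    }
    if (range == 'I') {              /* eigenvalues with indices il..iu */
        assert(il >= 1 && iu <= n);
        m = iu - il + 1;
        if (m <= 0) return 0;
        *first = il - 1;
        return m;
    }

    assert(range == 'V');            /* eigenvalues with vl <= w[i] <= vu (assumed closed) */
    for (int i = 0; i < n; ++i) {
        if (w[i] >= vl && w[i] <= vu) {
            if (m == 0) *first = i;
            ++m;
        }
    }
    return m;
}
```

The returned count plays the role of the output m described under Output Parameters; with range = 'V' it is not known in advance, which is why workspace sizing for that mode is harder to predict.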
Input Parameters
np = the number of rows local to a given process.
nq = the number of columns local to a given process.
- jobz
- (global) Must be 'N' or 'V'. Specifies whether eigenvectors are computed: If jobz = 'N', then only eigenvalues are computed. If jobz = 'V', then eigenvalues and eigenvectors are computed.
- range
- (global) Must be 'A', 'V', or 'I'. If range = 'A', all eigenvalues will be found. If range = 'V', all eigenvalues in the interval [vl, vu] will be found. If range = 'I', the eigenvalues with indices il through iu will be found.
- uplo
- (global) Must be 'U' or 'L'. Specifies whether the upper or lower triangular part of the Hermitian matrix A is stored: If uplo = 'U', a stores the upper triangular part of A. If uplo = 'L', a stores the lower triangular part of A.
- n
- (global) The number of rows and columns of the matrix A (n ≥ 0).
- a
- (local) Block cyclic array of global size n*n and local size lld_a*LOCc(ja+n-1). On entry, the Hermitian matrix A. If uplo = 'U', only the upper triangular part of A is used to define the elements of the Hermitian matrix. If uplo = 'L', only the lower triangular part of A is used to define the elements of the Hermitian matrix.
- ia,ja
- (global) The row and column indices in the global matrix A indicating the first row and the first column of the submatrix A, respectively.
- desca
- (global and local) Array of size dlen_. The array descriptor for the distributed matrix A. If desca[ctxt_ - 1] is incorrect, p?heevx cannot guarantee correct error reporting.
- vl,vu
- (global) If range = 'V', the lower and upper bounds of the interval to be searched for eigenvalues; not referenced if range = 'A' or 'I'.
- il,iu
- (global) If range = 'I', the indices of the smallest and largest eigenvalues to be returned. Constraints: il ≥ 1; min(il, n) ≤ iu ≤ n. Not referenced if range = 'A' or 'V'.
- abstol
- (global) The absolute error tolerance for the eigenvalues. An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a, b] of width less than or equal to abstol + eps*max(|a|, |b|), where eps is the machine precision. If abstol is less than or equal to zero, then eps*norm(T) is used in its place, where norm(T) is the 1-norm of the tridiagonal matrix obtained by reducing A to tridiagonal form. If jobz = 'V', setting abstol to p?lamch(context, 'U') yields the most orthogonal eigenvectors. Eigenvalues are computed most accurately when abstol is set to twice the underflow threshold, 2*p?lamch('S'), not zero. If this function returns with ((mod(info,2) ≠ 0) .or. (mod(info/8,2) ≠ 0)), indicating that some eigenvalues or eigenvectors did not converge, try setting abstol to 2*p?lamch('S'). Here mod(x, y) is the integer remainder of x/y.
- orfac
- (global) Specifies which eigenvectors should be reorthogonalized. Eigenvectors that correspond to eigenvalues which are within tol = orfac*norm(A) of each other are to be reorthogonalized. However, if the workspace is insufficient (see lwork), tol may be decreased until all eigenvectors to be reorthogonalized can be stored in one process. No reorthogonalization will be done if orfac equals zero. A default value of 1.0e-3 is used if orfac is negative. orfac should be identical on all processes.
- iz,jz
- (global) The row and column indices in the global matrix Z indicating the first row and the first column of the submatrix Z, respectively.
- descz
- (global and local) Array of size dlen_. The array descriptor for the distributed matrix Z. descz[ctxt_ - 1] must equal desca[ctxt_ - 1].
- work
- (local) Array of size lwork.
- lwork
- (local) The size of the array work.
  If only eigenvalues are requested: lwork ≥ n + max(nb*(np0 + 1), 3).
  If eigenvectors are requested: lwork ≥ n + (np0 + mq0 + nb)*nb, with nq0 = numroc(nn, nb, 0, 0, NPCOL).
  lwork ≥ 5*n + max(5*nn, np0*mq0 + 2*nb*nb) + iceil(neig, NPROW*NPCOL)*nn
  For optimal performance, greater workspace is needed, that is, lwork ≥ max(lwork, nhetrd_lwork), where lwork is as defined above, and:
  nhetrd_lwork = n + 2*(anb + 1)*(4*nps + 2) + (nps + 1)*nps
  ictxt = desca[ctxt_ - 1]
  anb = pjlaenv(ictxt, 3, 'pchettrd', 'L', 0, 0, 0, 0)
  sqnpc = sqrt(dble(NPROW*NPCOL))
  nps = max(numroc(n, 1, 0, 0, sqnpc), 2*anb)
  If lwork = -1, then lwork is global input and a workspace query is assumed; the function only calculates the size required for optimal performance for all work arrays. Each of these values is returned in the first entry of the corresponding work arrays, and no error message is issued by pxerbla.
- rwork
- (local) Workspace array of size lrwork.
- lrwork
- (local) The size of the array rwork. See below for definitions of variables used to define lrwork.
  If no eigenvectors are requested (jobz = 'N'), then lrwork ≥ 5*nn + 4*n.
  If eigenvectors are requested (jobz = 'V'), then the amount of workspace required to guarantee that all eigenvectors are computed is:
  lrwork ≥ 4*n + max(5*nn, np0*mq0 + 2*nb*nb) + iceil(neig, NPROW*NPCOL)*nn
  The computed eigenvectors may not be orthogonal if the minimal workspace is supplied and orfac is too small. To guarantee orthogonality (at the cost of potentially poor performance), add the following value to lrwork: (clustersize - 1)*n, where clustersize is the number of eigenvalues in the largest cluster, and a cluster is defined as a set of close eigenvalues:
  {w[k - 1], ..., w[k + clustersize - 2] | w[j] ≤ w[j - 1] + orfac*2*norm(A)}.
  Variable definitions:
  neig = number of eigenvectors requested;
  nb = desca[mb_ - 1] = desca[nb_ - 1] = descz[mb_ - 1] = descz[nb_ - 1];
  nn = max(n, nb, 2);
  desca[rsrc_ - 1] = desca[csrc_ - 1] = descz[rsrc_ - 1] = descz[csrc_ - 1] = 0;
  np0 = numroc(nn, nb, 0, 0, NPROW);
  mq0 = numroc(max(neig, nb, 2), nb, 0, 0, NPCOL);
  iceil(x, y) is a ScaLAPACK function returning ceiling(x/y).
  When lrwork is too small: If lrwork is too small to guarantee orthogonality, p?heevx attempts to maintain orthogonality in the clusters with the smallest spacing between the eigenvalues. If lrwork is too small to compute all the eigenvectors requested, no computation is performed and info = -25 is returned. Note that when range = 'V', p?heevx does not know how many eigenvectors are requested until the eigenvalues are computed. Therefore, when range = 'V' and as long as lrwork is large enough to allow p?heevx to compute the eigenvalues, p?heevx will compute the eigenvalues and as many eigenvectors as it can.
  Relationship between workspace, orthogonality and performance: If clustersize ≥ n/sqrt(NPROW*NPCOL), then providing enough space to compute all the eigenvectors orthogonally will cause serious degradation in performance. In the limit (that is, clustersize = n-1), p?stein will perform no better than ?stein on 1 processor. For clustersize = n/sqrt(NPROW*NPCOL), reorthogonalizing all eigenvectors will increase the total execution time by a factor of 2 or more. For clustersize > n/sqrt(NPROW*NPCOL), execution time will grow as the square of the cluster size, all other factors remaining equal and assuming enough workspace. Less workspace means less reorthogonalization but faster execution.
  If lrwork = -1, then lrwork is global input and a workspace query is assumed; the function only calculates the size required for optimal performance for all work arrays. Each of these values is returned in the first entry of the corresponding work arrays, and no error message is issued by pxerbla.
- iwork
- (local) Workspace array.
- liwork
- (local) The size of the array iwork. liwork ≥ 6*nnp, where nnp = max(n, NPROW*NPCOL + 1, 4). If liwork = -1, then liwork is global input and a workspace query is assumed; the function only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.
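The lower bounds given above for lwork, lrwork, and liwork can be evaluated by hand before allocating workspace. The following is a minimal sketch of that arithmetic, not library code: numroc_sketch and iceil_sketch are local stand-ins written for illustration (real code would call the library's own numroc), and heevx_min_sizes and heevx_sizes are hypothetical names. It assumes zero source-process offsets, as the variable definitions require, and it leaves out both the larger nhetrd_lwork term for optimal performance and the extra (clustersize - 1)*n term for guaranteed orthogonality.

```c
#include <assert.h>

/* Illustrative re-implementation of the block-cyclic local count (NUMROC):
   number of rows or columns of an n-element dimension, split into blocks of
   size nb over nprocs processes, that land on process iproc. */
static long numroc_sketch(long n, long nb, long iproc, long isrcproc, long nprocs)
{
    assert(nb > 0 && nprocs > 0);
    long mydist    = (nprocs + iproc - isrcproc) % nprocs;
    long nblocks   = n / nb;                    /* whole blocks in the dimension */
    long count     = (nblocks / nprocs) * nb;   /* blocks every process receives */
    long extrablks = nblocks % nprocs;
    if (mydist < extrablks)       count += nb;      /* one extra full block */
    else if (mydist == extrablks) count += n % nb;  /* the trailing partial block */
    return count;
}

/* ceiling(x / y) for non-negative x and positive y */
static long iceil_sketch(long x, long y) { assert(y > 0); return (x + y - 1) / y; }

static long max2(long a, long b) { return a > b ? a : b; }
static long max3(long a, long b, long c) { return max2(a, max2(b, c)); }

typedef struct { long lwork, lrwork, liwork; } heevx_sizes;

/* Minimum workspace sizes from the formulas in the parameter descriptions.
   jobz is 'N' or 'V'; neig is the number of eigenvectors requested. */
static heevx_sizes heevx_min_sizes(char jobz, long n, long nb, long neig,
                                   long nprow, long npcol)
{
    heevx_sizes s;
    long nn  = max3(n, nb, 2);
    long np0 = numroc_sketch(nn, nb, 0, 0, nprow);
    long mq0 = numroc_sketch(max3(neig, nb, 2), nb, 0, 0, npcol);
    long nnp = max3(n, nprow * npcol + 1, 4);

    if (jobz == 'V') {
        s.lwork  = n + (np0 + mq0 + nb) * nb;
        s.lrwork = 4 * n + max2(5 * nn, np0 * mq0 + 2 * nb * nb)
                 + iceil_sketch(neig, nprow * npcol) * nn;
    } else {
        s.lwork  = n + max2(nb * (np0 + 1), 3);
        s.lrwork = 5 * nn + 4 * n;
    }
    s.liwork = 6 * nnp;
    return s;
}
```

For example, with n = 1000, nb = 64, a 2-by-2 process grid, and all 1000 eigenvectors requested, the sketch gives lwork = 70632, lrwork = 524336, and liwork = 6000. In practice, a workspace query (setting the size arguments to -1) is likely the more robust choice, since it returns the sizes the library itself computes.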
Output Parameters
- a
- On exit, the lower triangle (if uplo = 'L'), or the upper triangle (if uplo = 'U') of A, including the diagonal, is overwritten.
- m
- (global) The total number of eigenvalues found; 0 ≤ m ≤ n.
- nz
- (global) Total number of eigenvectors computed; 0 ≤ nz ≤ m. The number of columns of z that are filled. If jobz ≠ 'V', nz is not referenced. If jobz = 'V', nz = m unless the user supplies insufficient space and