p?stein

Computes the eigenvectors of a tridiagonal matrix using inverse iteration.

Syntax

Fortran:

call psstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr, gap, info)

call pdstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr, gap, info)

call pcstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr, gap, info)

call pzstein(n, d, e, m, w, iblock, isplit, orfac, z, iz, jz, descz, work, lwork, iwork, liwork, ifail, iclustr, gap, info)

C:

void psstein (MKL_INT *n , float *d , float *e , MKL_INT *m , float *w , MKL_INT *iblock , MKL_INT *isplit , float *orfac , float *z , MKL_INT *iz , MKL_INT *jz , MKL_INT *descz , float *work , MKL_INT *lwork , MKL_INT *iwork , MKL_INT *liwork , MKL_INT *ifail , MKL_INT *iclustr , float *gap , MKL_INT *info );

void pdstein (MKL_INT *n , double *d , double *e , MKL_INT *m , double *w , MKL_INT *iblock , MKL_INT *isplit , double *orfac , double *z , MKL_INT *iz , MKL_INT *jz , MKL_INT *descz , double *work , MKL_INT *lwork , MKL_INT *iwork , MKL_INT *liwork , MKL_INT *ifail , MKL_INT *iclustr , double *gap , MKL_INT *info );

void pcstein (MKL_INT *n , float *d , float *e , MKL_INT *m , float *w , MKL_INT *iblock , MKL_INT *isplit , float *orfac , MKL_Complex8 *z , MKL_INT *iz , MKL_INT *jz , MKL_INT *descz , float *work , MKL_INT *lwork , MKL_INT *iwork , MKL_INT *liwork , MKL_INT *ifail , MKL_INT *iclustr , float *gap , MKL_INT *info );

void pzstein (MKL_INT *n , double *d , double *e , MKL_INT *m , double *w , MKL_INT *iblock , MKL_INT *isplit , double *orfac , MKL_Complex16 *z , MKL_INT *iz , MKL_INT *jz , MKL_INT *descz , double *work , MKL_INT *lwork , MKL_INT *iwork , MKL_INT *liwork , MKL_INT *ifail , MKL_INT *iclustr , double *gap , MKL_INT *info );

Include Files

  • C: mkl_scalapack.h

Description

The p?stein routine computes the eigenvectors of a symmetric tridiagonal matrix T corresponding to specified eigenvalues, by inverse iteration. p?stein does not orthogonalize vectors that are on different processes. The extent of orthogonalization is controlled by the input parameter lwork. Eigenvectors that are to be orthogonalized are computed by the same process. p?stein decides on the allocation of work among the processes and then calls ?stein2 (modified LAPACK routine) on each individual process. If insufficient workspace is allocated, the expected orthogonalization may not be done.

Note

If the eigenvectors obtained are not orthogonal, increase lwork and run the code again.

p = NPROW*NPCOL is the total number of processes.

Input Parameters

n

(global) INTEGER. The order of the matrix T (n 0).

m

(global) INTEGER. The number of eigenvectors to be returned.

d, e, w

(global)

REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors.

Arrays: d(*) contains the diagonal elements of T.

size (n).

e(*) contains the off-diagonal elements of T.

size (n-1).

w(*) contains all the eigenvalues grouped by split-off block.The eigenvalues are supplied from smallest to largest within the block. (Here the output array w from p?stebz with order = 'B' is expected. The array should be replicated in all processes.)

size(m)

iblock

(global) INTEGER.

Array, size (n). The submatrix indices associated with the corresponding eigenvalues in w--1 for eigenvalues belonging to the first submatrix from the top, 2 for those belonging to the second submatrix, etc. (The output array iblock from p?stebz is expected here).

isplit

(global) INTEGER.

Array, size (n). The splitting points, at which T breaks up into submatrices. The first submatrix consists of rows/columns 1 to isplit(1), the second of rows/columns isplit(1)+1 through isplit(2), etc., and the nsplit-th consists of rows/columns isplit(nsplit-1)+1 through isplit(nsplit)=n . (The output array isplit from p?stebz is expected here.)

orfac

(global)

REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors.

orfac specifies which eigenvectors should be orthogonalized. Eigenvectors that correspond to eigenvalues within orfac*||T|| of each other are to be orthogonalized. However, if the workspace is insufficient (see lwork), this tolerance may be decreased until all eigenvectors can be stored in one process. No orthogonalization is done if orfac is equal to zero. A default value of 1000 is used if orfac is negative. orfac should be identical on all processes

iz, jz

(global) INTEGER. The row and column indices in the global array z indicating the first row and the first column of the submatrix Z, respectively.

descz

(global and local) INTEGER array, dimension (dlen_). The array descriptor for the distributed matrix Z.

work

(local). REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors.

Workspace array, size (lwork).

lwork

(local) INTEGER.

lwork controls the extent of orthogonalization which can be done. The number of eigenvectors for which storage is allocated on each process is nvec = floor((lwork-max(5*n,np00*mq00))/n). Eigenvectors corresponding to eigenvalue clusters of size nvec - ceil(m/p) + 1 are guaranteed to be orthogonal (the orthogonality is similar to that obtained from ?stein2).

Note

lwork must be no smaller than max(5*n,np00*mq00) + ceil(m/p)*n and should have the same input value on all processes.

It is the minimum value of lwork input on different processes that is significant.

If lwork = -1, then lwork is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.

iwork

(local) INTEGER.

Workspace array, size (3n+p+1).

liwork

(local) INTEGER. The size of the array iwork. It must be greater than (3*n+p+1).

If liwork = -1, then liwork is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.

Output Parameters

z

(local)

REAL for psstein

DOUBLE PRECISION for pdstein

COMPLEX for pcstein

DOUBLE COMPLEX for pzstein.

Array, size (descz(dlen_), n/NPCOL + NB). z contains the computed eigenvectors associated with the specified eigenvalues. Any vector which fails to converge is set to its current iterate after MAXIT iterations (See ?stein2). On output, z is distributed across the p processes in block cyclic format.

work(1)

On exit, work(1) gives a lower bound on the workspace (lwork) that guarantees the user desired orthogonalization (see orfac). Note that this may overestimate the minimum workspace needed.

iwork

On exit, iwork(1) contains the amount of integer workspace required.

On exit, the iwork(2) through iwork(p+2) indicate the eigenvectors computed by each process. Process i computes eigenvectors indexed iwork(i+2)+1 through iwork(i+3).

ifail

(global) INTEGER. Array, size (m). On normal exit, all elements of ifail are zero. If one or more eigenvectors fail to converge after MAXIT iterations (as in ?stein), then info > 0 is returned. If mod(info, m+1)>0, then for i=1 to mod(info,m+1), the eigenvector corresponding to the eigenvalue w(ifail(i)) failed to converge (w refers to the array of eigenvalues on output).

iclustr

(global) INTEGER. Array, size (2*p)

This output array contains indices of eigenvectors corresponding to a cluster of eigenvalues that could not be orthogonalized due to insufficient workspace (see lwork, orfac and info). Eigenvectors corresponding to clusters of eigenvalues indexed iclustr(2*I-1) to iclustr(2*I), i = 1 to info/(m+1), could not be orthogonalized due to lack of workspace. Hence the eigenvectors corresponding to these clusters may not be orthogonal. iclustr is a zero terminated array ---(iclustr(2*k).ne.0.and.iclustr(2*k+1).eq.0) if and only if k is the number of clusters.

gap

(global)

REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors.

This output array contains the gap between eigenvalues whose eigenvectors could not be orthogonalized. The info/m output values in this array correspond to the info/(m+1) clusters indicated by the array iclustr. As a result, the dot product between eigenvectors corresponding to the i-th cluster may be as high as (O(n)*macheps)/gap(i).

info

(global) INTEGER.

If info = 0, the execution is successful.

If info < 0: If the i-th argument is an array and the j-entry had an illegal value, then info = -(i*100+j),

If the i-th argument is a scalar and had an illegal value, then info = -i.

If info < 0: if info = -i, the i-th argument had an illegal value.

If info > 0: if mod(info, m+1) = i, then i eigenvectors failed to converge in MAXIT iterations. Their indices are stored in the array ifail. If info/(m+1) = i, then eigenvectors corresponding to i clusters of eigenvalues could not be orthogonalized due to insufficient workspace. The indices of the clusters are stored in the array iclustr.

For more complete information about compiler optimizations, see our Optimization Notice.