Contents

p?stein

Computes the eigenvectors of a tridiagonal matrix using inverse iteration.

Syntax

void
psstein
(
MKL_INT
*n
,
float
*d
,
float
*e
,
MKL_INT
*m
,
float
*w
,
MKL_INT
*iblock
,
MKL_INT
*isplit
,
float
*orfac
,
float
*z
,
MKL_INT
*iz
,
MKL_INT
*jz
,
MKL_INT
*descz
,
float
*work
,
MKL_INT
*lwork
,
MKL_INT
*iwork
,
MKL_INT
*liwork
,
MKL_INT
*ifail
,
MKL_INT
*iclustr
,
float
*gap
,
MKL_INT
*info
);
void
pdstein
(
MKL_INT
*n
,
double
*d
,
double
*e
,
MKL_INT
*m
,
double
*w
,
MKL_INT
*iblock
,
MKL_INT
*isplit
,
double
*orfac
,
double
*z
,
MKL_INT
*iz
,
MKL_INT
*jz
,
MKL_INT
*descz
,
double
*work
,
MKL_INT
*lwork
,
MKL_INT
*iwork
,
MKL_INT
*liwork
,
MKL_INT
*ifail
,
MKL_INT
*iclustr
,
double
*gap
,
MKL_INT
*info
);
void
pcstein
(
MKL_INT
*n
,
float
*d
,
float
*e
,
MKL_INT
*m
,
float
*w
,
MKL_INT
*iblock
,
MKL_INT
*isplit
,
float
*orfac
,
MKL_Complex8
*z
,
MKL_INT
*iz
,
MKL_INT
*jz
,
MKL_INT
*descz
,
float
*work
,
MKL_INT
*lwork
,
MKL_INT
*iwork
,
MKL_INT
*liwork
,
MKL_INT
*ifail
,
MKL_INT
*iclustr
,
float
*gap
,
MKL_INT
*info
);
void
pzstein
(
MKL_INT
*n
,
double
*d
,
double
*e
,
MKL_INT
*m
,
double
*w
,
MKL_INT
*iblock
,
MKL_INT
*isplit
,
double
*orfac
,
MKL_Complex16
*z
,
MKL_INT
*iz
,
MKL_INT
*jz
,
MKL_INT
*descz
,
double
*work
,
MKL_INT
*lwork
,
MKL_INT
*iwork
,
MKL_INT
*liwork
,
MKL_INT
*ifail
,
MKL_INT
*iclustr
,
double
*gap
,
MKL_INT
*info
);
Include Files
  • mkl_scalapack.h
Description
The
p?stein
function
computes the eigenvectors of a symmetric tridiagonal matrix
T
corresponding to specified eigenvalues, by inverse iteration.
p?stein
does not orthogonalize vectors that are on different processes. The extent of orthogonalization is controlled by the input parameter
lwork
. Eigenvectors that are to be orthogonalized are computed by the same process.
p?stein
decides on the allocation of work among the processes and then calls
?stein2
(modified LAPACK
function
) on each individual process. If insufficient workspace is allocated, the expected orthogonalization may not be done.
If the eigenvectors obtained are not orthogonal, increase
lwork
and run the code again.
p
=
NPROW
*
NPCOL
is the total number of processes.
Input Parameters
n
(global) The order of the matrix
T
(
n
0)
.
m
(global) The number of eigenvectors to be returned.
d
,
e
,
w
(global)
Arrays:
d
of size
n
contains the diagonal elements of
T
.
e
of size
n
-1
contains the off-diagonal elements of
T
.
w
of size
m
contains all the eigenvalues grouped by split-off block. The eigenvalues are supplied from smallest to largest within the block. (Here the output array
w
from
p?stebz
with order
=
'B'
is expected. The array should be replicated in all processes.)
iblock
(global)
Array of size
n
. The submatrix indices associated with the corresponding eigenvalues in
w
: 1
for eigenvalues belonging to the first submatrix from the top, 2 for those belonging to the second submatrix, etc. (The output array
iblock
from
p?stebz
is expected here).
isplit
(global)
Array of size
n
. The splitting points at which
T
breaks up into submatrices. The first submatrix consists of rows/columns 1 to
isplit
[0]
, the second of rows/columns
isplit
[0]
+1
through
isplit
[1]
, and so on, and the
nsplit
-th submatrix consists of rows/columns
isplit
[
nsplit
-2]
+1
through
isplit
[
nsplit
-1]
=
n
. (The output array
isplit
from
p?stebz
is expected here.)
orfac
(global)
orfac
specifies which eigenvectors should be orthogonalized. Eigenvectors that correspond to eigenvalues within
orfac
*||T||
of each other are to be orthogonalized. However, if the workspace is insufficient (see
lwork
), this tolerance may be decreased until all eigenvectors can be stored in one process. No orthogonalization is done if
orfac
is equal to zero. A default value of 1000 is used if
orfac
is negative.
orfac
should be identical on all processes
iz
,
jz
(global) The row and column indices in the global matrix
Z
indicating the first row and the first column of the submatrix
Z
, respectively.
descz
(global and local) array of size
dlen_
. The array descriptor for the distributed matrix
Z
.
work
(local).
Workspace array of size
lwork
.
lwork
(local)
lwork
controls the extent of orthogonalization which can be done. The number of eigenvectors for which storage is allocated on each process is
nvec
=
floor
((
lwork
-max(5*
n
,
np
00*
mq
00))/
n
)
. Eigenvectors corresponding to eigenvalue clusters of size (
nvec
-
ceil
(
m
/
p
) + 1)
are guaranteed to be orthogonal (the orthogonality is similar to that obtained from
?stein2
).
lwork
must be no smaller than
max(5*
n
,
np
00*
mq
00) +
ceil
(
m
/
p
)*
n
and should have the same input value on all processes.
It is the minimum value of
lwork
input on different processes that is significant.
If
lwork
= -1
, then
lwork
is global input and a workspace query is assumed; the
function
only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by
pxerbla
.
iwork
(local)
Workspace array of size
3
n
+
p
+1
.
liwork
(local) The size of the array
iwork
. It must be greater than
3*
n
+
p
+1
.
If
liwork
= -1
, then
liwork
is global input and a workspace query is assumed; the
function
only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by
pxerbla
.
Output Parameters
z
(local)
Array of size
descz
[
dlen_
-1]
,
n
/
NPCOL
+
NB
)
.
z
contains the computed eigenvectors associated with the specified eigenvalues. Any vector which fails to converge is set to its current iterate after MAXIT iterations (See
?stein2
). On output,
z
is distributed across the
p
processes in block cyclic format.
work
On exit,
work
[0]
gives a lower bound on the workspace (
lwork
) that guarantees the user desired orthogonalization (see
orfac
). Note that this may overestimate the minimum workspace needed.
iwork
On exit,
iwork
[0]
contains the amount of integer workspace required.
On exit, the
iwork
[1]
through
iwork
[p+1]
indicate the eigenvectors computed by each process. Process
i
computes eigenvectors indexed
iwork
[i+1]
+1
through
iwork
[i+2]
.
ifail
(global) Array of size
m
. On normal exit, all elements of
ifail
are zero. If one or more eigenvectors fail to converge after MAXIT iterations (as in
?stein
), then
info
> 0
is returned. If
mod(
info
,
m
+1)>0
, then for
i
=1
to
mod
(
info
,
m
+1)
, the eigenvector corresponding to the eigenvalue
w
[
ifail
[i-1]-1]
failed to converge (
w
refers to the array of eigenvalues on output).
mod(
x
,
y
)
is the integer remainder of
x
/
y
.
iclustr
(global) Array of size
2*
p
.
This output array contains indices of eigenvectors corresponding to a cluster of eigenvalues that could not be orthogonalized due to insufficient workspace (see
lwork
,
orfac
and
info
). Eigenvectors corresponding to clusters of eigenvalues indexed
iclustr
(2*I-1)
to
iclustr
(2*I)
,
i
= 1
to
info
/(
m
+1)
, could not be orthogonalized due to lack of workspace. Hence the eigenvectors corresponding to these clusters may not be orthogonal.
iclustr
is a zero terminated array:
iclustr
[2*
k
-1]
0 and
iclustr
[2*
k
]
= 0 if and only if
k
is the number of clusters.
gap
(global)
This output array contains the gap between eigenvalues whose eigenvectors could not be orthogonalized. The
info
/
m
output values in this array correspond to the
info
/(
m
+1)
clusters indicated by the array
iclustr
. As a result, the dot product between eigenvectors corresponding to the
i-th
cluster may be as high as
(
O
(
n
)*
macheps
)/
gap
[
i
-1]
.
info
(global)
If
info
= 0
, the execution is successful.
If
info
< 0
: If the
i
-th argument is an array and the
j
-th entry
, indexed
j-
1,
had an illegal value, then
info
=
-(
i
*100+
j
),
If the
i
-th argument is a scalar and had an illegal value, then
info
=
-
i
.
If
info
< 0
: if
info
=
-
i
, the
i
-th argument had an illegal value.
If
info
> 0
: if
mod
(
info
,
m
+1)
=
i
, then
i
eigenvectors failed to converge in MAXIT iterations. Their indices are stored in the array
ifail
. If
info
/(
m
+1)
=
i
, then eigenvectors corresponding to
i
clusters of eigenvalues could not be orthogonalized due to insufficient workspace. The indices of the clusters are stored in the array
iclustr
.

Product and Performance Information

1

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reservered for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804