Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

?stegr2b

Computes the selected eigenvalues and eigenvectors of the real symmetric tridiagonal matrix in parallel on multiple processors, starting from the eigenvalues and initial representations computed by ?stegr2a.

Syntax

void sstegr2b (char* jobz, MKL_INT* n, float* d, float* e, MKL_INT* m,
               float* w, float* z, MKL_INT* ldz, MKL_INT* nzc, MKL_INT* isuppz,
               float* work, MKL_INT* lwork, MKL_INT* iwork, MKL_INT* liwork,
               MKL_INT* dol, MKL_INT* dou, MKL_INT* needil, MKL_INT* neediu,
               MKL_INT* indwlc, float* pivmin, float* scale, float* wl, float* wu,
               MKL_INT* vstart, MKL_INT* finish, MKL_INT* maxcls, MKL_INT* ndepth,
               MKL_INT* parity, MKL_INT* zoffset, MKL_INT* info);

void dstegr2b (char* jobz, MKL_INT* n, double* d, double* e, MKL_INT* m,
               double* w, double* z, MKL_INT* ldz, MKL_INT* nzc, MKL_INT* isuppz,
               double* work, MKL_INT* lwork, MKL_INT* iwork, MKL_INT* liwork,
               MKL_INT* dol, MKL_INT* dou, MKL_INT* needil, MKL_INT* neediu,
               MKL_INT* indwlc, double* pivmin, double* scale, double* wl, double* wu,
               MKL_INT* vstart, MKL_INT* finish, MKL_INT* maxcls, MKL_INT* ndepth,
               MKL_INT* parity, MKL_INT* zoffset, MKL_INT* info);

Include Files
  • mkl_scalapack.h
Description
?stegr2b should only be called after a call to ?stegr2a. From the eigenvalues and initial representations computed by ?stegr2a, ?stegr2b computes the selected eigenvalues and eigenvectors of the real symmetric tridiagonal matrix in parallel on multiple processors. It may be invoked multiple times on a given processor because the locally relevant representation tree might depend on spectral information that is "owned" by other processors and might need to be communicated.
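
The repeated invocation described above can be pictured as a loop that keeps calling the routine until it reports completion. The following is only a minimal sketch under stated assumptions: pass_fn and exchange_fn are hypothetical callbacks standing in for one ?stegr2b call and for the inter-process communication step, and demo_pass and demo_exchange are toy stand-ins that finish after a fixed number of tree levels. None of these names belong to the library, and plain int is used in place of MKL_INT to keep the sketch self-contained.

```c
/* Hypothetical per-pass callback standing in for one ?stegr2b call.
   It is NOT the library routine; it only carries the state flags. */
typedef void (*pass_fn)(int *vstart, int *finish, int *ndepth, void *ctx);

/* Hypothetical callback that exchanges eigenvalues between processes. */
typedef void (*exchange_fn)(void *ctx);

/* Calls pass() until *finish becomes non-zero or max_passes is reached.
   Returns the number of passes performed, or -1 if it never finished. */
int run_until_finished(pass_fn pass, exchange_fn exchange, void *ctx,
                       int max_passes)
{
    int vstart = 1;   /* non-zero on the first pass only */
    int finish = 0;   /* set non-zero by pass() when all eigenpairs are done */
    int ndepth = 0;   /* zero on the initial pass */
    int passes = 0;

    while (!finish && passes < max_passes) {
        pass(&vstart, &finish, &ndepth, ctx);
        ++passes;
        if (!finish)
            exchange(ctx);   /* fetch spectral data owned by other processes */
    }
    return finish ? passes : -1;
}

/* Toy state: pretend the representation tree has `levels` layers. */
struct demo_ctx { int levels; int exchanges; };

/* Toy pass: clears vstart, then either descends one level or finishes. */
void demo_pass(int *vstart, int *finish, int *ndepth, void *ctx)
{
    struct demo_ctx *c = (struct demo_ctx *)ctx;
    *vstart = 0;
    if (*ndepth + 1 >= c->levels)
        *finish = 1;
    else
        ++*ndepth;
}

/* Toy exchange: only counts how often communication was requested. */
void demo_exchange(void *ctx)
{
    ((struct demo_ctx *)ctx)->exchanges++;
}
```

With the toy callbacks and a three-level tree, the loop runs three passes and requests communication twice. In a real program the pass callback would wrap the actual routine call with its full argument list.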
Please note:
  • The calling sequence has two additional integer parameters, dol and dou, that should satisfy m ≥ dou ≥ dol ≥ 1. These parameters are only relevant for the case jobz = 'V'. ?stegr2b only computes the eigenvectors corresponding to eigenvalues dol through dou in w, which are stored at indices dol-1 through dou-1. (That is, instead of computing the eigenvectors belonging to w[0] through w[m-1], only the eigenvectors belonging to eigenvalues w[dol-1] through w[dou-1] are computed.) In this case, only the eigenvalues dol through dou are guaranteed to be accurately refined to all figures by Rayleigh-Quotient iteration.
  • The additional arguments vstart, finish, ndepth, parity, and zoffset are included as a thread-safe equivalent of save variables. These variables store details about the local representation tree, which is computed layerwise. For scalability reasons, eigenvalues belonging to the locally relevant representation tree might be computed on other processors. These need to be communicated before the inspection of the RRRs can proceed on any given layer. Note that the computation has ended only when the variable finish is non-zero. At that point, all eigenpairs between dol and dou have been computed, and m is set to dou - dol + 1.
  • ?stegr2b needs more workspace in z than the sequential ?stegr. The extra space is used to store the conformal embedding of the local representation tree.
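
The relation between the 1-based eigenvalue numbers dol and dou and the 0-based positions in w can be sketched as follows. This is an illustrative helper only: the function names are hypothetical and not part of the library, and plain int is used in place of MKL_INT.

```c
/* Returns 1 if the constraint m >= dou >= dol >= 1 holds, 0 otherwise. */
int stegr2b_range_ok(int m, int dol, int dou)
{
    return dol >= 1 && dou >= dol && m >= dou;
}

/* Converts the 1-based inclusive range [dol, dou] into a 0-based starting
   index into w and a count of eigenvalues.
   Returns 0 on success, -1 if the range violates the constraint. */
int stegr2b_w_range(int m, int dol, int dou, int *first, int *count)
{
    if (!stegr2b_range_ok(m, dol, dou))
        return -1;
    *first = dol - 1;         /* eigenvalue dol is stored in w[dol-1] */
    *count = dou - dol + 1;   /* number of eigenpairs in the range */
    return 0;
}
```

For m = 10, dol = 3, and dou = 5, the helper reports a starting index of 2 and a count of 3, that is, w[2] through w[4].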
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE3, and SSSE3.
Input Parameters
jobz
= 'N': Compute eigenvalues only;
= 'V': Compute eigenvalues and eigenvectors.
n
The order of the matrix. n ≥ 0.
d
Array of size n. The n diagonal elements of the tridiagonal matrix T. Overwritten on exit.
e
Array of size n. The (n-1) subdiagonal elements of the tridiagonal matrix T, in elements 0 to n-2 of e. e[n-1] need not be set on input, but is used internally as workspace. Overwritten on exit.
m
The total number of eigenvalues found in ?stegr2a. 0 ≤ m ≤ n.
w
Array of size n. The first m elements contain approximations to the selected eigenvalues in ascending order. Note that only the eigenvalues from the locally relevant part of the representation tree, that is, all the clusters that include eigenvalues from dol through dou, are reliable on this processor. (It does not need to know about any others.)
ldz
The leading dimension of the array z. ldz ≥ 1, and if jobz = 'V', then ldz ≥ max(1, n).
nzc
The number of eigenvectors to be held in the array z, which stores the matrix Z.
lwork
The size of the array work. lwork ≥ max(1, 18*n) if jobz = 'V', and lwork ≥ max(1, 12*n) if jobz = 'N'.
If lwork = -1, then a workspace query is assumed; the function only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued.
liwork
The size of the array iwork. liwork ≥ max(1, 10*n) if the eigenvectors are desired, and liwork ≥ max(1, 8*n) if only the eigenvalues are to be computed.
If liwork = -1, then a workspace query is assumed; the function only calculates the optimal size of the iwork array, returns this value as the first entry of the iwork array, and no error message related to liwork is issued.
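
The lower bounds stated for lwork and liwork can be written as two small helpers. This is a sketch, not library code: the function names and the want_vectors flag (non-zero for jobz = 'V', zero for jobz = 'N') are assumptions made for illustration, and long is used in place of MKL_INT. The values are the documented minimums, not the result of a workspace query.

```c
/* Returns the larger of 1 and v. */
static long at_least_one(long v)
{
    return v > 1 ? v : 1;
}

/* Minimum lwork: max(1, 18*n) with eigenvectors, max(1, 12*n) without. */
long stegr2b_min_lwork(int want_vectors, long n)
{
    return at_least_one((want_vectors ? 18 : 12) * n);
}

/* Minimum liwork: max(1, 10*n) with eigenvectors, max(1, 8*n) without. */
long stegr2b_min_liwork(int want_vectors, long n)
{
    return at_least_one((want_vectors ? 10 : 8) * n);
}
```

For n = 100 with eigenvectors requested, these give 1800 and 1000 elements respectively.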
dol, dou
From the eigenvalues w[0] through w[m-1], only eigenvectors Z(:,dol) to Z(:,dou) are computed.
If dol > 1, then Z(:,dol-1-zoffset) is used and overwritten.
If dou < m, then Z(:,dou+1-zoffset) is used and overwritten.
needil, neediu
Describe the left and right outermost eigenvalues still to be computed. Initially computed by ?larre2a and modified in the course of the algorithm.
pivmin
The minimum pivot in the Sturm sequence for T.
scale
The scaling factor for T. Used for unscaling the eigenvalues at the very end of the algorithm.
wl, wu
The interval (wl, wu] contains all the wanted eigenvalues.
vstart
Non-zero on initialization, set to zero afterwards.
finish
Indicates whether all eigenpairs have been computed.
maxcls
The largest cluster worked on by this processor in the representation tree.
ndepth
The current depth of the representation tree. Set to zero on initial pass, changed when the deeper levels of the representation tree are generated.
parity
An internal parameter needed for the storage of the clusters on the current level of the representation tree.
zoffset
Offset for storing the eigenpairs when z is distributed in 1D-cyclic fashion.
Output Parameters
z
Array of size ldz * max(1, m). If jobz = 'V' and info = 0, then a subset of the first m columns of the matrix Z, stored in z, contains the orthonormal eigenvectors of the matrix T corresponding to the selected eigenvalues, with the i-th column of Z holding the eigenvector associated with w[i-1].
See dol, dou for more information.
isuppz
Array of size 2*max(1, m). The support of the eigenvectors in z, that is, the indices indicating the nonzero elements in z. The i-th computed eigenvector is nonzero only in elements isuppz[2*i-2] through isuppz[2*i-1]. This is relevant in the case when the matrix is split. isuppz is only set if n > 2.
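
Reading the support of the i-th eigenvector from isuppz can be sketched as below. The helper name is hypothetical and plain int stands in for MKL_INT. The sketch also assumes that the values stored in isuppz are 1-based row indices, as in the underlying Fortran convention; the text above does not state this explicitly, so check it against your data before relying on it.

```c
/* Looks up the support of the i-th computed eigenvector (i is 1-based).
   Stores the first and last row index exactly as found in isuppz
   (assumed 1-based) and returns the number of rows in the support. */
int eigvec_support(const int *isuppz, int i, int *first_row, int *last_row)
{
    *first_row = isuppz[2 * i - 2];
    *last_row  = isuppz[2 * i - 1];
    return *last_row - *first_row + 1;
}
```

For example, with isuppz = {1, 4, 2, 6}, the second eigenvector has first_row = 2 and last_row = 6, covering 5 rows.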
work
On exit, if info = 0, work[0] returns the optimal (and minimal) lwork.
iwork
On exit, if info = 0, iwork[0] returns the optimal liwork.
needil, neediu
Modified in the course of the algorithm.
indwlc
Pointer into the workspace location where the local eigenvalue representations are stored. ("Local eigenvalues" are those relative to the individual shifts of the RRRs.)
vstart
Non-zero on initialization, set to zero afterwards.
finish
Indicates whether all eigenpairs have been computed.
maxcls
The largest cluster worked on by this processor in the representation tree.
ndepth
The current depth of the representation tree. Set to zero on initial pass, changed when the deeper levels of the representation tree are generated.
parity
An internal parameter needed for the storage of the clusters on the current level of the representation tree.
info
On exit:
info = 0: successful exit.
info = -i: the i-th argument had an illegal value.
info = 20x: internal error in ?larrv2. Here, the digit x = abs(iinfo) < 10, where iinfo is the nonzero error code returned by ?larrv2.
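
A caller might classify the returned info value along these lines. This is a sketch under an assumption: "20x" is read here as the three-digit decimal value 200 + x with x between 1 and 9, which is an interpretation of the wording above rather than a confirmed encoding. The type and function names are hypothetical, and plain int stands in for MKL_INT.

```c
enum info_kind {
    INFO_OK,        /* info = 0 */
    INFO_BAD_ARG,   /* info = -i: detail holds i */
    INFO_LARRV2,    /* info = 20x: detail holds the digit x */
    INFO_OTHER      /* any value not described in the text */
};

struct info_desc {
    enum info_kind kind;
    int detail;
};

/* Classifies an info value according to the documented cases. */
struct info_desc decode_info(int info)
{
    struct info_desc r;
    if (info == 0) {
        r.kind = INFO_OK;
        r.detail = 0;
    } else if (info < 0) {
        r.kind = INFO_BAD_ARG;
        r.detail = -info;          /* index of the offending argument */
    } else if (info >= 201 && info <= 209) {
        r.kind = INFO_LARRV2;
        r.detail = info - 200;     /* assumed encoding: 200 + abs(iinfo) */
    } else {
        r.kind = INFO_OTHER;
        r.detail = info;
    }
    return r;
}
```

Under that reading, info = -3 points at the third argument and info = 203 reports error code 3 from ?larrv2.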
