Developer Reference

p?heevr

Computes selected eigenvalues and, optionally, eigenvectors of a Hermitian matrix using Relatively Robust Representation.

Syntax

void pcheevr (char* jobz, char* range, char* uplo, MKL_INT* n, MKL_Complex8* a, MKL_INT* ia, MKL_INT* ja, MKL_INT* desca, float* vl, float* vu, MKL_INT* il, MKL_INT* iu, MKL_INT* m, MKL_INT* nz, float* w, MKL_Complex8* z, MKL_INT* iz, MKL_INT* jz, MKL_INT* descz, MKL_Complex8* work, MKL_INT* lwork, float* rwork, MKL_INT* lrwork, MKL_INT* iwork, MKL_INT* liwork, MKL_INT* info);
void pzheevr (char* jobz, char* range, char* uplo, MKL_INT* n, MKL_Complex16* a, MKL_INT* ia, MKL_INT* ja, MKL_INT* desca, double* vl, double* vu, MKL_INT* il, MKL_INT* iu, MKL_INT* m, MKL_INT* nz, double* w, MKL_Complex16* z, MKL_INT* iz, MKL_INT* jz, MKL_INT* descz, MKL_Complex16* work, MKL_INT* lwork, double* rwork, MKL_INT* lrwork, MKL_INT* iwork, MKL_INT* liwork, MKL_INT* info);
Include Files
  • mkl_scalapack.h
Description
p?heevr computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A distributed in 2D block-cyclic format by calling the recommended sequence of ScaLAPACK functions.
First, the matrix A is reduced to complex Hermitian tridiagonal form. Then, the eigenproblem is solved using the parallel MRRR algorithm. Last, if eigenvectors have been computed, a backtransformation is done.
Upon successful completion, each processor stores a copy of all computed eigenvalues in w. The eigenvector matrix Z is stored in 2D block-cyclic format distributed over all processors.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE3, and SSSE3.
Input Parameters
jobz
(global)
Specifies whether or not to compute the eigenvectors:
= 'N': Compute eigenvalues only.
= 'V': Compute eigenvalues and eigenvectors.
range
(global)
= 'A': all eigenvalues will be found.
= 'V': all eigenvalues in the interval [vl, vu] will be found.
= 'I': the il-th through iu-th eigenvalues will be found.
uplo
(global)
Specifies whether the upper or lower triangular part of the Hermitian matrix A is stored:
= 'U': Upper triangular
= 'L': Lower triangular
n
(global)
The number of rows and columns of the matrix A. n ≥ 0.
a
Block-cyclic array, global size (n, n), local size lld_a * LOCc(ja+n-1)
Contains the local pieces of the Hermitian distributed matrix A. If uplo = 'U', only the upper triangular part of a is used to define the elements of the Hermitian matrix. If uplo = 'L', only the lower triangular part of a is used to define the elements of the Hermitian matrix.
ia
(global)
Global row index in the global matrix A that points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix.
ja
(global)
Global column index in the global matrix A that points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix.
desca
(global and local) array of size dlen_. (The ScaLAPACK descriptor length is dlen_ = 9.)
The array descriptor for the distributed matrix a. The descriptor stores details about the 2D block-cyclic storage; see the notes below. If desca is incorrect, p?heevr cannot work correctly.
Also note the array alignment requirements specified below.
vl
(global)
If range = 'V', the lower bound of the interval to be searched for eigenvalues. Not referenced if range = 'A' or 'I'.
vu
(global)
If range = 'V', the upper bound of the interval to be searched for eigenvalues. Not referenced if range = 'A' or 'I'.
il
(global)
If range = 'I', the index (from smallest to largest) of the smallest eigenvalue to be returned. il ≥ 1.
Not referenced if range = 'A'.
iu
(global)
If range = 'I', the index (from smallest to largest) of the largest eigenvalue to be returned. min(il, n) ≤ iu ≤ n.
Not referenced if range = 'A'.
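The constraints on il and iu for range = 'I' can be checked before calling the function. The following is a minimal sketch in C, assuming plain int in place of MKL_INT. The helper name check_index_range is hypothetical and is not part of Intel MKL.

```c
#include <assert.h>

/* Hypothetical helper (not part of Intel MKL).
   Returns 1 if il and iu satisfy the documented constraints for
   range = 'I', and 0 otherwise:
       il >= 1   and   min(il, n) <= iu <= n            */
int check_index_range(int n, int il, int iu)
{
    assert(n >= 0);                      /* documented requirement on n */
    int lower = (il < n) ? il : n;       /* min(il, n) */
    return il >= 1 && iu >= lower && iu <= n;
}
```

Note that the lower bound min(il, n) rather than il lets the empty case n = 0, il = 1, iu = 0 pass the check.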
iz
(global)
Global row index in the global matrix Z that points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix.
jz
(global)
Global column index in the global matrix Z that points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix.
descz
(global and local) array of size dlen_.
The array descriptor for the distributed matrix z. descz[ctxt_ - 1] must equal desca[ctxt_ - 1].
work
(local workspace) array of size lwork
lwork
(local)
Size of work array, must be at least 3.
If only eigenvalues are requested:
lwork ≥ n + max( nb * ( np00 + 1 ), nb * 3 )
If eigenvectors are requested:
lwork ≥ n + ( np00 + mq00 + nb ) * nb
For definitions of np00 and mq00, see lrwork.
For optimal performance, greater workspace is needed, i.e.
lwork ≥ max( lwork, nhetrd_lwork )
where lwork is as defined above, and
nhetrd_lwork = n + 2*( anb + 1 )*( 4*nps + 2 ) + ( nps + 1 ) * nps
ictxt = desca[ctxt_ - 1]
anb = pjlaenv( ictxt, 3, 'PCHETTRD', 'L', 0, 0, 0, 0 )
sqnpc = sqrt( real( nprow * npcol ) )
nps = max( numroc( n, 1, 0, 0, sqnpc ), 2*anb )
If lwork = -1, then lwork is global input and a workspace query is assumed; the function only calculates the optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.
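The minimum lwork formulas can be evaluated directly when a workspace query is not convenient. The following is a minimal sketch in C, assuming plain int in place of MKL_INT and assuming the caller has already computed np00 and mq00 as defined under lrwork. The helper names imax and pheevr_min_lwork are hypothetical and are not part of Intel MKL. The sketch does not cover nhetrd_lwork, which depends on pjlaenv.

```c
#include <assert.h>

/* Larger of two ints. */
int imax(int a, int b) { return a > b ? a : b; }

/* Hypothetical helper (not part of Intel MKL).
   Minimum size of the work array from the documented formulas:
     eigenvalues only: n + max(nb*(np00 + 1), nb*3)
     with eigenvectors: n + (np00 + mq00 + nb)*nb
   The result is never below 3, the documented absolute minimum. */
int pheevr_min_lwork(int want_vectors, int n, int nb, int np00, int mq00)
{
    assert(n >= 0 && nb >= 1 && np00 >= 0 && mq00 >= 0);
    int lwork = want_vectors
        ? n + (np00 + mq00 + nb) * nb
        : n + imax(nb * (np00 + 1), nb * 3);
    return imax(lwork, 3);
}
```

For large problems or ILP64 builds, a wider integer type would be needed to avoid overflow; a workspace query with lwork = -1 remains the way to obtain the size the function itself considers optimal.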
rwork
(local workspace) array of size lrwork
lrwork
(local)
Size of rwork, must be at least 3.
See below for definitions of variables used to define lrwork.
If no eigenvectors are requested (jobz = 'N'), then
lrwork ≥ 2 + 5 * n + max( 12 * n, nb * ( np00 + 1 ) )
If eigenvectors are requested (jobz = 'V'), then the amount of workspace required is:
lrwork ≥ 2 + 5 * n + max( 18 * n, np00 * mq00 + 2 * nb * nb ) + ( 2 + iceil( neig, nprow * npcol ) ) * n
iceil( x, y ) is the ceiling of x / y.
Variable definitions:
neig = number of eigenvectors requested
nb = desca[mb_ - 1] = desca[nb_ - 1] = descz[mb_ - 1] = descz[nb_ - 1]
nn = max( n, nb, 2 )
desca[rsrc_ - 1] = desca[csrc_ - 1] = descz[rsrc_ - 1] = descz[csrc_ - 1] = 0
np00 = numroc( nn, nb, 0, 0, nprow )
mq00 = numroc( max( neig, nb, 2 ), nb, 0, 0, npcol )
iceil( x, y ) is a ScaLAPACK function returning ceiling( x / y ), and nprow and npcol can be determined by calling the function blacs_gridinfo.
If lrwork = -1, then lrwork is global input and a workspace query is assumed; the function only calculates the size required for optimal performance for all work arrays. Each of these values is returned in the first entry of the corresponding work arrays, and no error message is issued by pxerbla.
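The variable definitions above can be turned into a small calculation of the minimum lrwork. The following is a minimal sketch in C, assuming plain int in place of MKL_INT. The names imax, iceil_div, numroc_local, and pheevr_min_lrwork are hypothetical; numroc_local is a local re-implementation of the block-counting rule that ScaLAPACK's numroc applies, written here so the example does not need to link against the library.

```c
#include <assert.h>

/* Larger of two ints. */
int imax(int a, int b) { return a > b ? a : b; }

/* Ceiling of x / y for x >= 0 and y > 0, as iceil is described in the text. */
int iceil_div(int x, int y) { return (x + y - 1) / y; }

/* Number of rows or columns of an n-element dimension, split into blocks of
   size nb over nprocs processes, that land on process iproc when the first
   block is owned by process isrcproc (sketch of the numroc rule). */
int numroc_local(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    int mydist  = (nprocs + iproc - isrcproc) % nprocs;
    int nblocks = n / nb;                    /* number of full blocks    */
    int count   = (nblocks / nprocs) * nb;   /* full rounds of blocks    */
    int extra   = nblocks % nprocs;          /* leftover full blocks     */
    if (mydist < extra)       count += nb;       /* gets one extra block */
    else if (mydist == extra) count += n % nb;   /* gets the partial one */
    return count;
}

/* Hypothetical helper: minimum lrwork from the documented formulas. */
int pheevr_min_lrwork(int want_vectors, int n, int nb, int neig,
                      int nprow, int npcol)
{
    assert(n >= 0 && nb >= 1 && neig >= 0 && nprow >= 1 && npcol >= 1);
    int nn   = imax(imax(n, nb), 2);
    int np00 = numroc_local(nn, nb, 0, 0, nprow);
    int mq00 = numroc_local(imax(imax(neig, nb), 2), nb, 0, 0, npcol);
    if (!want_vectors)
        return 2 + 5 * n + imax(12 * n, nb * (np00 + 1));
    return 2 + 5 * n + imax(18 * n, np00 * mq00 + 2 * nb * nb)
         + (2 + iceil_div(neig, nprow * npcol)) * n;
}
```

In a real program, nprow and npcol would come from blacs_gridinfo and nb from the descriptors, as the text states.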
iwork
(local workspace) array of size liwork
liwork
(local)
Size of iwork.
Let nnp = max( n, nprow * npcol + 1, 4 ). Then:
liwork ≥ 12 * nnp + 2 * n when the eigenvectors are desired
liwork ≥ 10 * nnp + 2 * n when only the eigenvalues have to be computed
If liwork = -1, then liwork is global input and a workspace query is assumed; the function only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by pxerbla.
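The liwork bound depends only on n and the process grid shape, so it is straightforward to compute ahead of time. The following is a minimal sketch in C, assuming plain int in place of MKL_INT. The helper names imax and pheevr_min_liwork are hypothetical and are not part of Intel MKL.

```c
#include <assert.h>

/* Larger of two ints. */
int imax(int a, int b) { return a > b ? a : b; }

/* Hypothetical helper: minimum liwork from the documented formulas,
   with nnp = max(n, nprow*npcol + 1, 4):
     with eigenvectors: 12*nnp + 2*n
     eigenvalues only:  10*nnp + 2*n            */
int pheevr_min_liwork(int want_vectors, int n, int nprow, int npcol)
{
    assert(n >= 0 && nprow >= 1 && npcol >= 1);
    int nnp = imax(imax(n, nprow * npcol + 1), 4);
    return (want_vectors ? 12 : 10) * nnp + 2 * n;
}
```

For small matrices on a larger grid, nnp is set by the grid size rather than by n; with n = 2 on a 2 x 2 grid, nnp is 5.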
Output Parameters
a
The lower triangle (if uplo = 'L') or the upper triangle (if uplo = 'U') of a, including the diagonal, is destroyed.
m
(global)
Total number of eigenvalues found. 0 ≤ m ≤ n.
nz
(global)
Total number of eigenvectors computed. 0 ≤ nz ≤ m.
The number of columns of z that are filled.
If jobz ≠ 'V', nz is not referenced.
If jobz = 'V', nz = m.
w
(global) array of size n
On normal exit, the first m entries contain the selected eigenvalues in ascending order.
z
(local) array, global size (n, n), local size lld_z * LOCc(jz+n-1)
If jobz = 'V', then on normal exit the first m columns of z contain the orthonormal eigenvectors of the matrix corresponding to the selected eigenvalues.
If jobz = 'N', then z is not referenced.
work
On return, work[0] contains the amount of workspace adequate for optimal performance.
rwork
On return, rwork[0] contains the optimal amount of workspace required for efficient execution.
If jobz = 'N', rwork[0] = optimal amount of workspace required to compute the eigenvalues.
If jobz = 'V', rwork[0] = optimal amount of workspace required to compute eigenvalues and eigenvectors.
iwork
On return, iwork[0] contains the amount of integer workspace required.
info
(global)
= 0: successful exit
< 0: If the i-th argument is an array and the j-th entry had an illegal value, then info = -(i*100+j); if the i-th argument is a scalar and had an illegal value, then info = -i.
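A negative info value can be split back into the argument position i and, for array arguments, the entry j. The following is a minimal sketch in C, assuming plain int in place of MKL_INT. The type info_error and the function decode_info are hypothetical names, not part of Intel MKL. The sketch relies on the function having fewer than 100 arguments, so any code of 100 or more must come from the array form.

```c
#include <assert.h>

/* Hypothetical result type: arg is the argument position i,
   entry is j for array arguments and 0 for scalar arguments. */
typedef struct {
    int arg;
    int entry;
} info_error;

/* Decode a negative info value as documented:
     array argument:  info = -(i*100 + j)
     scalar argument: info = -i                  */
info_error decode_info(int info)
{
    assert(info < 0);            /* only negative values encode an argument */
    info_error e;
    int code = -info;
    if (code >= 100) {           /* array form */
        e.arg   = code / 100;
        e.entry = code % 100;
    } else {                     /* scalar form */
        e.arg   = code;
        e.entry = 0;
    }
    return e;
}
```

For example, info = -4 decodes to the fourth argument (n), and info = -802 decodes to entry 2 of the eighth argument (desca), counting arguments in the order of the Syntax section.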
Application Notes
The distributed submatrices a(ia:*, ja:*) and z(iz:iz+m-1, jz:jz+n-1) must satisfy the following alignment properties:
  1. Identical (quadratic) dimension: desca[m_ - 1] = descz[m_ - 1] = desca[n_ - 1] = descz[n_ - 1]
  2. Quadratic conformal blocking: desca[mb_ - 1] = desca[nb_ - 1] = descz[mb_ - 1] = descz[nb_ - 1], desca[rsrc_ - 1] = descz[rsrc_ - 1]
  3. mod( ia - 1, mb_a ) = mod( iz - 1, mb_z ) = 0
mod( x, y ) is the integer remainder of x / y.
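The three alignment properties, together with the requirement that descz[ctxt_ - 1] equal desca[ctxt_ - 1], can be verified on the descriptors before the call. The following is a minimal sketch in C, assuming plain int in place of MKL_INT and assuming the usual 1-based ScaLAPACK descriptor positions (dtype_ = 1 through lld_ = 9), indexed as desc[name - 1] in the same way as the text. The function name check_descriptors is hypothetical and is not part of Intel MKL.

```c
#include <assert.h>

/* 1-based descriptor entry positions (assumed standard ScaLAPACK layout). */
enum { dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5,
       nb_ = 6, rsrc_ = 7, csrc_ = 8, lld_ = 9 };

/* Hypothetical helper: returns 1 if desca/descz and the offsets ia/iz
   satisfy the documented requirements, 0 otherwise. */
int check_descriptors(const int *desca, const int *descz, int ia, int iz)
{
    assert(desca[mb_ - 1] > 0 && descz[mb_ - 1] > 0);

    /* Same BLACS context for both matrices. */
    int same_ctxt = desca[ctxt_ - 1] == descz[ctxt_ - 1];

    /* 1. Identical (quadratic) dimension. */
    int same_dim = desca[m_ - 1] == descz[m_ - 1]
                && descz[m_ - 1] == desca[n_ - 1]
                && desca[n_ - 1] == descz[n_ - 1];

    /* 2. Quadratic conformal blocking and equal row source. */
    int same_blk = desca[mb_ - 1] == desca[nb_ - 1]
                && desca[nb_ - 1] == descz[mb_ - 1]
                && descz[mb_ - 1] == descz[nb_ - 1]
                && desca[rsrc_ - 1] == descz[rsrc_ - 1];

    /* 3. Submatrix row offsets start on a block boundary. */
    int aligned = (ia - 1) % desca[mb_ - 1] == 0
               && (iz - 1) % descz[mb_ - 1] == 0;

    return same_ctxt && same_dim && same_blk && aligned;
}
```

With 16 x 16 blocks, ia = 1 and ia = 17 both start on a block boundary, while ia = 2 does not.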
