PBLAS Routine Naming Conventions
The naming convention for PBLAS routines is similar to that used for BLAS routines (see Routine Naming Conventions). As a general rule, each PBLAS routine that has a BLAS equivalent takes the BLAS name prefixed with the letter p, which stands for "parallel".
PBLAS routine names have the following structure:
p <character> <name> <mod> ( )
The <character> field indicates the Fortran data type:
- s: real, single precision
- c: complex, single precision
- d: real, double precision
- z: complex, double precision
Some routines and functions can have combined character codes, such as sc or dz. For example, the function pscasum uses a complex input array and returns a real value.
In PBLAS level 1, the <name> field indicates the operation type. For example, the PBLAS level 1 routines p?dot, p?swap, and p?copy compute a vector dot product, a vector swap, and a vector copy, respectively.
In PBLAS level 2 and 3, <name> reflects the matrix argument type:
- ge: general matrix
- sy: symmetric matrix
- he: Hermitian matrix
- tr: triangular matrix
In PBLAS level 3, the <name> value tran indicates the transposition of the matrix.
The <mod> field, if present, provides additional details of the operation. The PBLAS level 1 names can have the following characters in the <mod> field:
- c: conjugated vector
- u: unconjugated vector
The PBLAS level 2 names can have the following additional characters in the <mod> field:
- mv: matrix-vector product
- sv: solving a system of linear equations with matrix-vector operations
- r: rank-1 update of a matrix
- r2: rank-2 update of a matrix.
The PBLAS level 3 names can have the following additional characters in the <mod> field:
- mm: matrix-matrix product
- sm: solving a system of linear equations with matrix-matrix operations
- rk: rank-k update of a matrix
- r2k: rank-2k update of a matrix.
The examples below show how to interpret PBLAS routine names:
- pddot (<p> <d> <dot>): double-precision real distributed vector-vector dot product
- pcdotc (<p> <c> <dot> <c>): complex distributed vector-vector dot product, conjugated
- pscasum (<p> <sc> <asum>): sum of magnitudes of distributed vector elements, single precision real output and single precision complex input
- pcdotu (<p> <c> <dot> <u>): distributed vector-vector dot product, unconjugated, complex
- psgemv (<p> <s> <ge> <mv>): distributed matrix-vector product, general matrix, single precision
- pztrmm (<p> <z> <tr> <mm>): distributed matrix-matrix product, triangular matrix, double-precision complex.
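The decomposition used in the examples above can be sketched as a small Python helper that splits a routine name into its <character>, <name>, and <mod> fields. This is only an illustration of the convention and is not part of PBLAS: the function parse_pblas_name, the PblasName type, and the lookup tables are hypothetical, and the tables cover only the codes s, c, d, z, sc, dz, the matrix types ge, sy, he, tr, and the few operation names mentioned in this section. A name that parses successfully is not guaranteed to exist as a PBLAS routine.

```python
from typing import NamedTuple, Optional, Tuple

# Lookup tables for the fields of the naming convention.
# They are deliberately small and illustrative, not a complete PBLAS list.
CHARACTER = {
    "s": "real, single precision",
    "c": "complex, single precision",
    "d": "real, double precision",
    "z": "complex, double precision",
    "sc": "real single-precision output, complex single-precision input",
    "dz": "real double-precision output, complex double-precision input",
}
MATRIX_NAME = {"ge": "general", "sy": "symmetric",
               "he": "Hermitian", "tr": "triangular"}
MATRIX_MOD = {"mv", "sv", "r", "r2", "mm", "sm", "rk", "r2k"}
# Names that are not built from a matrix type (operations named in the text).
SIMPLE_NAME = {"dot", "swap", "copy", "asum", "tran"}
CONJ_MOD = {"c": "conjugated", "u": "unconjugated"}


class PblasName(NamedTuple):
    """Hypothetical container for the three fields of a routine name."""
    character: str
    name: str
    mod: Optional[str]


def _split_rest(rest: str) -> Optional[Tuple[str, Optional[str]]]:
    """Split the part after <character> into (<name>, <mod>), or None."""
    # Level 2/3 form: two-letter matrix type followed by an operation code.
    if rest[:2] in MATRIX_NAME and rest[2:] in MATRIX_MOD:
        return rest[:2], rest[2:]
    # Operation name without a modifier, e.g. "dot" or "asum".
    if rest in SIMPLE_NAME:
        return rest, None
    # Operation name with a trailing c/u modifier, e.g. "dotc".
    if rest[:-1] in SIMPLE_NAME and rest[-1:] in CONJ_MOD:
        return rest[:-1], rest[-1]
    return None


def parse_pblas_name(routine: str) -> PblasName:
    """Decompose a name such as 'psgemv' into its naming fields."""
    routine = routine.lower()
    if not routine.startswith("p"):
        raise ValueError(f"{routine!r} does not start with the 'p' prefix")
    body = routine[1:]
    # Try a one-letter code first, then a combined two-letter code, so that
    # 'pscopy' reads as s + copy while 'pscasum' reads as sc + asum.
    for width in (1, 2):
        character, rest = body[:width], body[width:]
        if character in CHARACTER:
            fields = _split_rest(rest)
            if fields is not None:
                return PblasName(character, *fields)
    raise ValueError(f"{routine!r} does not match the tables in this sketch")


if __name__ == "__main__":
    for example in ("pddot", "pcdotc", "pscasum", "psgemv", "pztrmm"):
        print(example, parse_pblas_name(example))
```

Trying the one-letter code before the combined code is a design choice of this sketch: it resolves names such as pscopy, where the letters after p could otherwise be misread as the combined code sc.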