Developer Reference for Intel® oneAPI Math Kernel Library for C

ID 766684
Date 11/07/2023
Public


PBLAS Routines Overview

PBLAS models the computing environment as either a one-dimensional array of processes or a two-dimensional process grid. Before you call a PBLAS routine, all global matrices and vectors must be distributed over this array or grid.

PBLAS uses the two-dimensional block-cyclic data distribution as the layout for dense matrix computations. This distribution balances the work well across the available processors and makes it possible to use PBLAS Level 3 routines for optimal local computations. The information needed to map each global array to its corresponding process and memory location is stored in the so-called array descriptor associated with that global array. The table "Content of Array Descriptor for Dense Matrices" shows the structure of an array descriptor.
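As an illustration of the block-cyclic layout, the following C sketch maps one global index to the process that owns it and to the position in that process's local array, along a single dimension. The type bc_location and the function bc_locate are hypothetical names used only for this example and are not part of oneMKL. The sketch assumes 0-based indices and the standard block-cyclic rule in which consecutive blocks are dealt out to consecutive processes, starting at the source process.

```c
#include <assert.h>

/* Owner and local position of one global index in a 1-D block-cyclic layout.
   Hypothetical helper type, not part of oneMKL. */
typedef struct {
    int proc;   /* coordinate of the process that owns the element */
    int local;  /* 0-based index inside that process's local array */
} bc_location;

/* g      : 0-based global index (row or column)
   nb     : blocking factor along this dimension
   src    : process coordinate that holds the first block
   nprocs : number of processes along this dimension */
bc_location bc_locate(int g, int nb, int src, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0 && src >= 0 && src < nprocs);

    int block  = g / nb;   /* global block that contains g */
    int offset = g % nb;   /* position of g inside that block */

    bc_location loc;
    loc.proc  = (src + block) % nprocs;          /* blocks are dealt out cyclically */
    loc.local = (block / nprocs) * nb + offset;  /* full local blocks before this one */
    return loc;
}
```

For a two-dimensional grid, the same mapping is applied independently to the row index (with the row blocking factor and the number of process rows) and to the column index (with the column blocking factor and the number of process columns).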

Content of Array Descriptor for Dense Matrices
Array Element #   Name    Definition
1                 dtype   Descriptor type (=1 for dense matrices)
2                 ctxt    BLACS context handle for the process grid
3                 m       Number of rows in the global array
4                 n       Number of columns in the global array
5                 mb      Row blocking factor
6                 nb      Column blocking factor
7                 rsrc    Process row over which the first row of the global array is distributed
8                 csrc    Process column over which the first column of the global array is distributed
9                 lld     Leading dimension of the local array
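The following C sketch shows how the nine descriptor elements could be stored in an integer array. It is an illustration only: the function fill_dense_desc and the DESC_* constants are hypothetical names, the element type is assumed to be a plain int, and argument checking is reduced to a few assertions. Because C arrays are 0-based, element k of the table is stored at index k-1. In an application, the ScaLAPACK tool routine descinit is the usual way to initialize a descriptor.

```c
#include <assert.h>

/* 0-based C positions of the descriptor elements listed in the table.
   Hypothetical constants for this example, not oneMKL definitions. */
enum {
    DESC_DTYPE = 0, DESC_CTXT, DESC_M, DESC_N, DESC_MB,
    DESC_NB, DESC_RSRC, DESC_CSRC, DESC_LLD, DESC_LEN
};

/* Fill a dense-matrix descriptor from its individual fields. */
void fill_dense_desc(int desc[DESC_LEN], int ctxt, int m, int n,
                     int mb, int nb, int rsrc, int csrc, int lld)
{
    assert(m >= 0 && n >= 0 && mb > 0 && nb > 0 && lld >= 1);

    desc[DESC_DTYPE] = 1;     /* 1 marks a dense matrix */
    desc[DESC_CTXT]  = ctxt;  /* BLACS context handle for the process grid */
    desc[DESC_M]     = m;     /* rows in the global array */
    desc[DESC_N]     = n;     /* columns in the global array */
    desc[DESC_MB]    = mb;    /* row blocking factor */
    desc[DESC_NB]    = nb;    /* column blocking factor */
    desc[DESC_RSRC]  = rsrc;  /* process row that holds the first global row */
    desc[DESC_CSRC]  = csrc;  /* process column that holds the first global column */
    desc[DESC_LLD]   = lld;   /* leading dimension of the local array */
}
```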

The numbers of rows and columns of a global dense matrix that a particular process in the grid receives after the data is distributed are denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the ScaLAPACK tool routine numroc.
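To make the meaning of LOCr() and LOCc() concrete, the following C function is a minimal sketch of the count that numroc returns for one dimension of a block-cyclic distribution. The name numroc_sketch is hypothetical and is not part of oneMKL. Process coordinates are assumed to be 0-based, and the arguments are assumed to be valid. In an application, call the library's numroc routine rather than this sketch.

```c
#include <assert.h>

/* Number of rows (or columns) of a block-cyclically distributed dimension
   that land on one process.

   n        : global number of rows or columns
   nb       : blocking factor along this dimension
   iproc    : coordinate of the process whose share is computed
   isrcproc : coordinate of the process that holds the first block
   nprocs   : number of processes along this dimension */
int numroc_sketch(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs && isrcproc >= 0 && isrcproc < nprocs);

    int mydist  = (nprocs + iproc - isrcproc) % nprocs;  /* distance from the source process */
    int nblocks = n / nb;                                /* number of complete blocks */
    int count   = (nblocks / nprocs) * nb;               /* share every process receives */
    int extra   = nblocks % nprocs;                      /* complete blocks left over */

    if (mydist < extra)
        count += nb;       /* this process gets one more complete block */
    else if (mydist == extra)
        count += n % nb;   /* this process gets the trailing partial block */

    return count;
}
```

For a process at grid coordinates (myrow, mycol), LOCr corresponds to the call with the global row count, the row blocking factor, myrow, rsrc, and the number of process rows. LOCc corresponds to the call with the matching column quantities.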

After the block-cyclic distribution of the global data is complete, you can perform an operation on a submatrix of the global matrix A. This submatrix is contained in the global subarray sub(A), which is defined by the following six values (for dense matrices):

m

The number of rows of sub(A)

n

The number of columns of sub(A)

a

A pointer to the local array that holds this process's part of the distributed global array A

ia

The row index in the global array A of the first row of sub(A)

ja

The column index in the global array A of the first column of sub(A)

desca

The array descriptor for the global array A
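The values above can be read as selecting the m-by-n block of A whose top-left element is at global position (ia, ja). The following C sketch checks that such a block lies inside the global array described by desca. The function sub_in_bounds and the DESC_M and DESC_N constants are hypothetical names used only for this example. The sketch assumes that ia and ja are 1-based global indices, as in the Netlib PBLAS, and that desca is stored as a 0-based C array of int.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based C positions of the global row and column counts
   (descriptor elements 3 and 4). Hypothetical constants. */
enum { DESC_M = 2, DESC_N = 3 };

/* Returns 1 if the m-by-n block that starts at global position (ia, ja)
   lies entirely inside the global array described by desca, else 0.
   Assumes ia and ja are 1-based. */
int sub_in_bounds(int m, int n, int ia, int ja, const int *desca)
{
    assert(desca != NULL);

    if (m < 0 || n < 0 || ia < 1 || ja < 1)
        return 0;

    /* The last row and column of sub(A) must not exceed the global extents. */
    return ia + m - 1 <= desca[DESC_M] && ja + n - 1 <= desca[DESC_N];
}
```

With an 8-by-6 global array, for example, the 4-by-3 block at (ia, ja) = (5, 4) ends exactly at the last row and column and passes the check, while the same block at (6, 4) would run past the last row.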

Intel® oneAPI Math Kernel Library (oneMKL) provides PBLAS routines with an interface similar to that of the Netlib PBLAS (see http://www.netlib.org/scalapack/html/pblas_qref.html).