Developer Reference for Intel® oneAPI Math Kernel Library for C

ID 766684

Date 11/07/2023

mkl_?omatadd_batch_strided

Computes a group of out-of-place scaled matrix additions using general matrices.

void mkl_somatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, float alpha, const float * A, size_t lda, size_t stridea, float beta, const float * B, size_t ldb, size_t strideb, float * C, size_t ldc, size_t stridec, size_t batch_size);

void mkl_domatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, double alpha, const double * A, size_t lda, size_t stridea, double beta, const double * B, size_t ldb, size_t strideb, double * C, size_t ldc, size_t stridec, size_t batch_size);

void mkl_comatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, MKL_Complex8 alpha, const MKL_Complex8 * A, size_t lda, size_t stridea, MKL_Complex8 beta, const MKL_Complex8 * B, size_t ldb, size_t strideb, MKL_Complex8 * C, size_t ldc, size_t stridec, size_t batch_size);

void mkl_zomatadd_batch_strided(char ordering, char transa, char transb, size_t rows, size_t cols, MKL_Complex16 alpha, const MKL_Complex16 * A, size_t lda, size_t stridea, MKL_Complex16 beta, const MKL_Complex16 * B, size_t ldb, size_t strideb, MKL_Complex16 * C, size_t ldc, size_t stridec, size_t batch_size);

Description

The mkl_?omatadd_batch_strided routines perform a series of scaled matrix additions. They are similar to the mkl_?omatadd routines, but operate on a group of matrices rather than a single set of matrices.

The matrices A, B, and C are stored at a constant stride from each other in memory, given by the parameters stridea, strideb, and stridec. The operation is defined as:

for i = 0 … batch_size – 1
    A is a matrix at offset i * stridea in the array a
    B is a matrix at offset i * strideb in the array b
    C is a matrix at offset i * stridec in the array c
    C = alpha * op(A) + beta * op(B)
end for

where:

op(X) is one of op(X) = X, op(X) = X', op(X) = conjg(X) or op(X) = conjg(X').
alpha and beta are scalars.
A, B, and C are matrices.
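
The four forms of op(X) can be illustrated with a small element accessor. This is a minimal sketch, not oneMKL code: the cplx8 struct is a hypothetical stand-in for MKL_Complex8, op_elem is an illustrative helper name, and column-major storage is assumed. The trans codes are the 'N', 'T', 'R', and 'C' values described for transa under Input Parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MKL_Complex8 (a pair of floats); not the MKL type. */
typedef struct { float real; float imag; } cplx8;

/* Return element (i, j) of op(X), where X is stored column-major with
   leading dimension ldx. Illustrative helper, not part of oneMKL. */
cplx8 op_elem(char trans, const cplx8 *x, size_t ldx, size_t i, size_t j)
{
    int transpose = (trans == 'T' || trans == 't' || trans == 'C' || trans == 'c');
    int conjugate = (trans == 'R' || trans == 'r' || trans == 'C' || trans == 'c');
    /* Reject anything outside the four documented codes. */
    assert(transpose || conjugate || trans == 'N' || trans == 'n');

    /* op(X)(i, j) reads X(j, i) when transposing, X(i, j) otherwise. */
    cplx8 v = transpose ? x[j + i * ldx] : x[i + j * ldx];
    if (conjugate) {
        v.imag = -v.imag; /* conjg() negates the imaginary part */
    }
    return v;
}
```

For real data (the s and d variants) conjugation has no effect, so under this reading 'R' behaves like 'N' and 'C' behaves like 'T'.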

The input arrays a and b (arguments A and B in the prototypes) contain all the input matrices, and the single output array c (argument C) contains all the output matrices. The location of each matrix within its array is given by the stride lengths, and the number of matrices is given by the batch_size parameter.
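
The loop in the description can be written out as plain C to make the indexing concrete. The following is a reference sketch only, not the oneMKL implementation: the function name ref_somatadd_batch_strided_colmajor is hypothetical, and it assumes single-precision data, column-major storage, and only the 'N' and 'T' transposition codes.

```c
#include <assert.h>
#include <stddef.h>

/* Reference (non-MKL) sketch of the batched operation for float data,
   column-major storage, and trans codes 'N' or 'T' only. */
void ref_somatadd_batch_strided_colmajor(
    char transa, char transb, size_t rows, size_t cols,
    float alpha, const float *a, size_t lda, size_t stridea,
    float beta, const float *b, size_t ldb, size_t strideb,
    float *c, size_t ldc, size_t stridec, size_t batch_size)
{
    int ta = (transa == 'T' || transa == 't');
    int tb = (transb == 'T' || transb == 't');
    assert(ta || transa == 'N' || transa == 'n');
    assert(tb || transb == 'N' || transb == 'n');

    for (size_t i = 0; i < batch_size; ++i) {
        /* Matrix i of each operand starts i strides into its array. */
        const float *A = a + i * stridea;
        const float *B = b + i * strideb;
        float *C = c + i * stridec;

        for (size_t col = 0; col < cols; ++col) {
            for (size_t row = 0; row < rows; ++row) {
                /* A transposed operand is read at (col, row). */
                float av = ta ? A[col + row * lda] : A[row + col * lda];
                float bv = tb ? B[col + row * ldb] : B[row + col * ldb];
                C[row + col * ldc] = alpha * av + beta * bv;
            }
        }
    }
}
```

A loop like this can serve as a slow cross-check when validating the arguments passed to the library routine, since it uses the same meaning for the leading dimensions and strides.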

Input Parameters

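ordering: Specifies whether two-dimensional array storage is row-major or column-major. This is the char argument named ordering in the prototypes above; as in the mkl_?omatadd routines, 'R' or 'r' selects row-major and 'C' or 'c' selects column-major.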
transa: Specifies op(A), the transposition operation applied to the matrices A. 'N' or 'n' indicates no operation, 'T' or 't' is transposition, 'R' or 'r' is complex conjugation without transposition, and 'C' or 'c' is conjugate transposition.
transb: Specifies op(B), the transposition operation applied to the matrices B. Accepts the same values as transa.
rows: Number of rows for the result matrix C. Must be at least zero.
cols: Number of columns for the result matrix C. Must be at least zero.
alpha: Scaling factor for the matrices A.
a: Array holding the input matrices A. Must have size at least stridea*batch_size.
lda: Leading dimension of the A matrices. If matrices are stored using column major layout, lda must be at least rows if A is not transposed or at least cols if A is transposed. If matrices are stored using row major layout, lda must be at least cols if A is not transposed or at least rows if A is transposed. Must be positive.
stridea: Stride between the different A matrices. If matrices are stored using column major layout, stridea must be at least lda*cols if A is not transposed or at least lda*rows if A is transposed. If matrices are stored using row major layout, stridea must be at least lda*rows if A is not transposed or at least lda*cols if A is transposed.
beta: Scaling factor for the matrices B.
b: Array holding the input matrices B. Must have size at least strideb*batch_size.
ldb: Leading dimension of the B matrices. If matrices are stored using column major layout, ldb must be at least rows if B is not transposed or at least cols if B is transposed. If matrices are stored using row major layout, ldb must be at least cols if B is not transposed or at least rows if B is transposed. Must be positive.
strideb: Stride between the different B matrices. If matrices are stored using column major layout, strideb must be at least ldb*cols if B is not transposed or at least ldb*rows if B is transposed. If matrices are stored using row major layout, strideb must be at least ldb*rows if B is not transposed or at least ldb*cols if B is transposed.
c: Output array, overwritten by the results of batch_size matrix additions of the form alpha*op(A) + beta*op(B). Must have size at least stridec*batch_size.
ldc: Leading dimension of the C matrices. If matrices are stored using column major layout, ldc must be at least rows. If matrices are stored using row major layout, ldc must be at least cols. Must be positive.
stridec: Stride between the different C matrices. If matrices are stored using column major layout, stridec must be at least ldc*cols. If matrices are stored using row major layout, stridec must be at least ldc*rows.
batch_size: Specifies the number of input and output matrices to add.
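
The leading-dimension and stride bounds follow from the shape of each stored matrix: an operand that op() transposes is stored as cols x rows rather than rows x cols, a column-major matrix occupies ld elements per stored column, and a row-major matrix occupies ld elements per stored row. The helpers below are a sketch of that arithmetic; min_leading_dim and min_stride are hypothetical names and are not part of oneMKL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not part of oneMKL). row_major is nonzero for
   row-major storage; transposed is nonzero when op() transposes the
   operand ('T' or 'C'). rows and cols are the dimensions of the result
   matrix C. For C itself, pass transposed = 0. */
size_t min_leading_dim(int row_major, int transposed, size_t rows, size_t cols)
{
    /* The operand is stored as rows x cols, or cols x rows if transposed. */
    size_t stored_rows = transposed ? cols : rows;
    size_t stored_cols = transposed ? rows : cols;
    size_t ld = row_major ? stored_cols : stored_rows;
    return ld > 0 ? ld : 1; /* leading dimensions must be positive */
}

size_t min_stride(int row_major, int transposed, size_t ld, size_t rows, size_t cols)
{
    size_t stored_rows = transposed ? cols : rows;
    size_t stored_cols = transposed ? rows : cols;
    assert(ld > 0);
    /* Row-major: ld elements per stored row. Column-major: per stored column. */
    return ld * (row_major ? stored_rows : stored_cols);
}
```

For example, with rows = 2 and cols = 3 in column-major storage, a non-transposed operand needs a leading dimension of at least 2 and a stride of at least ld*3, while a transposed operand needs a leading dimension of at least 3 and a stride of at least ld*2. Each array must then hold at least stride*batch_size elements.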

Output Parameters

c: Array holding the updated matrices C.

Parent topic: BLAS-like Extensions
