cblas_?gemm_batch

Computes scalar-matrix-matrix products and adds the results to scalar matrix products for groups of general matrices.

Syntax

void cblas_sgemm_batch (const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE* transa_array, const CBLAS_TRANSPOSE* transb_array, const MKL_INT* m_array, const MKL_INT* n_array, const MKL_INT* k_array, const float* alpha_array, const float **a_array, const MKL_INT* lda_array, const float **b_array, const MKL_INT* ldb_array, const float* beta_array, float **c_array, const MKL_INT* ldc_array, const MKL_INT group_count, const MKL_INT* group_size);

void cblas_dgemm_batch (const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE* transa_array, const CBLAS_TRANSPOSE* transb_array, const MKL_INT* m_array, const MKL_INT* n_array, const MKL_INT* k_array, const double* alpha_array, const double **a_array, const MKL_INT* lda_array, const double **b_array, const MKL_INT* ldb_array, const double* beta_array, double **c_array, const MKL_INT* ldc_array, const MKL_INT group_count, const MKL_INT* group_size);

void cblas_cgemm_batch (const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE* transa_array, const CBLAS_TRANSPOSE* transb_array, const MKL_INT* m_array, const MKL_INT* n_array, const MKL_INT* k_array, const void *alpha_array, const void **a_array, const MKL_INT* lda_array, const void **b_array, const MKL_INT* ldb_array, const void *beta_array, void **c_array, const MKL_INT* ldc_array, const MKL_INT group_count, const MKL_INT* group_size);

void cblas_zgemm_batch (const CBLAS_LAYOUT Layout, const CBLAS_TRANSPOSE* transa_array, const CBLAS_TRANSPOSE* transb_array, const MKL_INT* m_array, const MKL_INT* n_array, const MKL_INT* k_array, const void *alpha_array, const void **a_array, const MKL_INT* lda_array, const void **b_array, const MKL_INT* ldb_array, const void *beta_array, void **c_array, const MKL_INT* ldc_array, const MKL_INT group_count, const MKL_INT* group_size);

Include Files

  • mkl.h

Description

The ?gemm_batch routines perform a series of matrix-matrix operations with general matrices. They are similar to their ?gemm counterparts, but operate on groups of matrices and process a number of groups in a single call. Within a group, all matrices share the same parameters: transposition options, dimensions, leading dimensions, and the scalars alpha and beta.

The operation is defined as

idx = 0
for i = 0..group_count - 1
     alpha and beta in alpha_array[i] and beta_array[i]
     for j = 0..group_size[i] - 1 
          A, B, and C matrix in a_array[idx], b_array[idx], and c_array[idx]
          C := alpha*op(A)*op(B) + beta*C,
          idx = idx + 1
     end for
 end for

where:

op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H,

alpha and beta are scalar elements of alpha_array and beta_array,

A, B and C are matrices such that for m, n, and k which are elements of m_array, n_array, and k_array:

op(A) is an m-by-k matrix,

op(B) is a k-by-n matrix,

C is an m-by-n matrix.

A, B, and C represent matrices stored at addresses pointed to by a_array, b_array, and c_array, respectively. The number of entries in a_array, b_array, and c_array is total_batch_count = the sum of all of the group_size entries.
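The operation defined above can be sketched in plain C. The following is a minimal reference sketch, not the MKL implementation: it covers only double precision, column-major storage, and op(A) = A, op(B) = B. The function name ref_dgemm_batch_colmajor_nn is hypothetical, and plain int stands in for MKL_INT so that the block compiles without mkl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop (not an MKL routine): double precision,
   column-major storage, no transposition of A or B. */
void ref_dgemm_batch_colmajor_nn(
    const int *m_array, const int *n_array, const int *k_array,
    const double *alpha_array,
    const double **a_array, const int *lda_array,
    const double **b_array, const int *ldb_array,
    const double *beta_array,
    double **c_array, const int *ldc_array,
    int group_count, const int *group_size)
{
    size_t idx = 0; /* running index into a_array, b_array, and c_array */
    for (int i = 0; i < group_count; ++i) {
        /* Every matrix in group i shares these parameters. */
        const int m = m_array[i], n = n_array[i], k = k_array[i];
        const int lda = lda_array[i], ldb = ldb_array[i], ldc = ldc_array[i];
        const double alpha = alpha_array[i], beta = beta_array[i];
        assert(m >= 0 && n >= 0 && k >= 0 && group_size[i] >= 0);

        for (int j = 0; j < group_size[i]; ++j, ++idx) {
            const double *A = a_array[idx];
            const double *B = b_array[idx];
            double *C = c_array[idx];
            for (int col = 0; col < n; ++col) {
                for (int row = 0; row < m; ++row) {
                    double acc = 0.0; /* (A*B)(row, col) */
                    for (int p = 0; p < k; ++p)
                        acc += A[row + (size_t)p * lda] * B[p + (size_t)col * ldb];
                    double *c = &C[row + (size_t)col * ldc];
                    /* With beta == 0 the old value of C is never read. */
                    *c = (beta == 0.0) ? alpha * acc : alpha * acc + beta * *c;
                }
            }
        }
    }
}
```

Note that idx advances across group boundaries, so the pointer arrays are indexed by position in the whole batch, while every other array is indexed by group.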

See also ?gemm for a detailed description of multiplication for general matrices, and ?gemm3m_batch, a BLAS-like extension routine, for similar matrix-matrix operations.

Note

Error checking is not performed for Intel MKL Windows* single dynamic libraries for the ?gemm_batch routines.

Input Parameters

Layout

Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).

transa_array

Array of size group_count. For the group i, transai = transa_array[i] specifies the form of op(A) used in the matrix multiplication:

if transai = CblasNoTrans, then op(A) = A;

if transai = CblasTrans, then op(A) = A^T;

if transai = CblasConjTrans, then op(A) = A^H.

transb_array

Array of size group_count. For the group i, transbi = transb_array[i] specifies the form of op(B) used in the matrix multiplication:

if transbi = CblasNoTrans, then op(B) = B;

if transbi = CblasTrans, then op(B) = B^T;

if transbi = CblasConjTrans, then op(B) = B^H.
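To illustrate how the transposition options interact with Layout, the sketch below reads element (r, c) of op(X) from a real matrix X stored with leading dimension ld. The enums layout_t and trans_t and the function op_element are hypothetical stand-ins for CBLAS_LAYOUT and CBLAS_TRANSPOSE, introduced only so that the block compiles without mkl.h. For real data the conjugate transpose is the same as the transpose; complex data would also need conjugation, which is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for CBLAS_LAYOUT and CBLAS_TRANSPOSE (assumed names). */
typedef enum { ROW_MAJOR, COL_MAJOR } layout_t;
typedef enum { NO_TRANS, TRANS, CONJ_TRANS } trans_t;

/* Returns element (r, c) of op(X) for a real matrix X.
   For real data, CONJ_TRANS behaves exactly like TRANS. */
double op_element(const double *x, int ld, layout_t layout,
                  trans_t trans, int r, int c)
{
    assert(ld >= 1 && r >= 0 && c >= 0);
    if (trans != NO_TRANS) {
        /* op(X)(r, c) is X(c, r): swap the indices. */
        int t = r;
        r = c;
        c = t;
    }
    /* Column-major: consecutive rows are adjacent in memory.
       Row-major: consecutive columns are adjacent in memory. */
    return (layout == COL_MAJOR) ? x[r + (size_t)c * ld]
                                 : x[(size_t)r * ld + c];
}
```

For example, with the six values 1..6 stored as a 2-by-3 column-major matrix (ld = 2), element (0, 2) of X and element (2, 0) of X^T both resolve to the same memory location.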

m_array

Array of size group_count. For the group i, mi = m_array[i] specifies the number of rows of the matrix op(A) and of the matrix C.

The value of each element of m_array must be at least zero.

n_array

Array of size group_count. For the group i, ni = n_array[i] specifies the number of columns of the matrix op(B) and the number of columns of the matrix C.

The value of each element of n_array must be at least zero.

k_array

Array of size group_count. For the group i, ki = k_array[i] specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B).

The value of each element of k_array must be at least zero.

alpha_array

Array of size group_count. For the group i, alpha_array[i] specifies the scalar alphai.

a_array

Array, size total_batch_count, of pointers to arrays used to store A matrices.

lda_array

Array of size group_count. For the group i, ldai = lda_array[i] specifies the leading dimension of the array storing matrix A as declared in the calling (sub)program.

 

transai=CblasNoTrans

transai=CblasTrans or transai=CblasConjTrans

Layout = CblasColMajor

ldai must be at least max(1, mi).

ldai must be at least max(1, ki)

Layout = CblasRowMajor

ldai must be at least max(1, ki)

ldai must be at least max(1, mi).

b_array

Array, size total_batch_count, of pointers to arrays used to store B matrices.

ldb_array

Array of size group_count. For the group i, ldbi = ldb_array[i] specifies the leading dimension of the array storing matrix B as declared in the calling (sub)program.

 

transbi=CblasNoTrans

transbi=CblasTrans or transbi=CblasConjTrans

Layout = CblasColMajor

ldbi must be at least max(1, ki).

ldbi must be at least max(1, ni).

Layout = CblasRowMajor

ldbi must be at least max(1, ni).

ldbi must be at least max(1, ki).

beta_array

Array of size group_count. For the group i, beta_array[i] specifies the scalar betai.

When betai is equal to zero, the C matrices in group i need not be set on input.

c_array

Array, size total_batch_count, of pointers to arrays used to store C matrices.

ldc_array

Array of size group_count. For the group i, ldci = ldc_array[i] specifies the leading dimension of all arrays storing matrix C in group i as declared in the calling (sub)program.

When Layout = CblasColMajor, ldci must be at least max(1, mi).

When Layout = CblasRowMajor, ldci must be at least max(1, ni).
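The leading-dimension requirements for A, B, and C all follow one rule: the leading dimension must cover the stored (untransposed) matrix along its contiguous direction. The sketch below expresses that rule as a small helper. The names layout_t, trans_t, and min_leading_dim are hypothetical and are not part of MKL.

```c
#include <assert.h>

/* Stand-ins for CBLAS_LAYOUT and CBLAS_TRANSPOSE (assumed names). */
typedef enum { ROW_MAJOR, COL_MAJOR } layout_t;
typedef enum { NO_TRANS, TRANS, CONJ_TRANS } trans_t;

static int max_int(int a, int b) { return a > b ? a : b; }

/* Minimum leading dimension of the array holding X, where op(X) is
   rows-by-cols. Pass NO_TRANS for C, which is never transposed. */
int min_leading_dim(layout_t layout, trans_t trans, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    /* The stored X is rows-by-cols if not transposed, cols-by-rows otherwise. */
    int stored_rows = (trans == NO_TRANS) ? rows : cols;
    int stored_cols = (trans == NO_TRANS) ? cols : rows;
    /* Column-major strides over a full column, row-major over a full row. */
    return max_int(1, layout == COL_MAJOR ? stored_rows : stored_cols);
}
```

For group i, the bounds would be min_leading_dim(layout, transai, mi, ki) for ldai, min_leading_dim(layout, transbi, ki, ni) for ldbi, and min_leading_dim(layout, NO_TRANS, mi, ni) for ldci.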

group_count

Specifies the number of groups. Must be at least 0.

group_size

Array of size group_count. The element group_size[i] specifies the number of matrices in group i. Each element in group_size must be at least 0.
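Because a_array, b_array, and c_array are indexed across all groups, their length is the sum of the group_size entries. The sketch below computes that length and fills a pointer array under one possible storage scheme, assumed here purely for illustration: all matrices sit back to back in a single buffer, and each matrix in group i occupies elems_per_matrix[i] elements. Both function names and the elems_per_matrix parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of group_size[0..group_count-1]: the required number of entries
   in a_array, b_array, and c_array. */
size_t total_batch_count(int group_count, const int *group_size)
{
    size_t total = 0;
    for (int i = 0; i < group_count; ++i) {
        assert(group_size[i] >= 0);
        total += (size_t)group_size[i];
    }
    return total;
}

/* Fills ptrs[] with the start address of every matrix, assuming the
   matrices are packed contiguously in base and that each matrix of
   group i occupies elems_per_matrix[i] doubles (an assumed layout).
   Returns the number of pointers written. */
size_t fill_matrix_pointers(double *base, double **ptrs,
                            int group_count, const int *group_size,
                            const size_t *elems_per_matrix)
{
    size_t idx = 0, offset = 0;
    for (int i = 0; i < group_count; ++i) {
        for (int j = 0; j < group_size[i]; ++j) {
            ptrs[idx++] = base + offset;
            offset += elems_per_matrix[i];
        }
    }
    return idx;
}
```

Nothing in the interface requires contiguous storage; the pointers may refer to independently allocated matrices, as long as each one satisfies the leading-dimension requirements of its group.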

Output Parameters

c_array

Overwritten by the mi-by-ni matrix (alphai*op(A)*op(B) + betai*C) for group i.
