# cblas_?gemm_batch

Computes scalar-matrix-matrix products and adds the results to scalar matrix products for groups of general matrices.

## Syntax

## Include Files

• mkl.h

## Description

The ?gemm_batch routines perform a series of matrix-matrix operations with general matrices. They are similar to their ?gemm counterparts, but the ?gemm_batch routines operate on groups of matrices, processing a number of groups at once. All matrices within a group share the same parameters.
The operation is defined as
```
idx = 0
for i = 0..group_count - 1
    alpha and beta in alpha_array[i] and beta_array[i]
    for j = 0..group_size[i] - 1
        A, B, and C matrix in a_array[idx], b_array[idx], and c_array[idx]
        C := alpha*op(A)*op(B) + beta*C
        idx = idx + 1
    end for
end for
```
where:
• op(X) is one of op(X) = X, op(X) = X^T, or op(X) = X^H,
• alpha and beta are scalar elements of alpha_array and beta_array,
• A, B, and C are matrices such that, for m, n, and k which are elements of m_array, n_array, and k_array:
  • op(A) is an m-by-k matrix,
  • op(B) is a k-by-n matrix,
  • C is an m-by-n matrix.
A, B, and C represent matrices stored at addresses pointed to by a_array, b_array, and c_array, respectively. The number of entries in a_array, b_array, and c_array is total_batch_count, the sum of all of the group_size entries.
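
The grouped loop described above can be sketched as a naive reference in plain C. This is only an illustration of the semantics, not the oneMKL implementation: `ref_dgemm_batch`, `ref_trans_t`, and `op_elem` are hypothetical names, the sketch assumes column-major storage and real double precision, and it omits the conjugate-transpose case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for CblasNoTrans / CblasTrans (not the MKL enums). */
typedef enum { REF_NO_TRANS, REF_TRANS } ref_trans_t;

/* Element (r, c) of op(X); X is stored column-major with leading dimension ld. */
static double op_elem(const double *x, int ld, ref_trans_t t, int r, int c)
{
    return (t == REF_NO_TRANS) ? x[(size_t)c * ld + r] : x[(size_t)r * ld + c];
}

/* Naive reference for the grouped loop: column-major, real double precision. */
void ref_dgemm_batch(const ref_trans_t *transa_array, const ref_trans_t *transb_array,
                     const int *m_array, const int *n_array, const int *k_array,
                     const double *alpha_array,
                     const double **a_array, const int *lda_array,
                     const double **b_array, const int *ldb_array,
                     const double *beta_array,
                     double **c_array, const int *ldc_array,
                     int group_count, const int *group_size)
{
    assert(group_count >= 0);
    int idx = 0;                              /* running index into the pointer arrays */
    for (int i = 0; i < group_count; ++i) {   /* one parameter set per group */
        for (int j = 0; j < group_size[i]; ++j, ++idx) {
            const double *a = a_array[idx];
            const double *b = b_array[idx];
            double *c = c_array[idx];
            for (int col = 0; col < n_array[i]; ++col) {
                for (int row = 0; row < m_array[i]; ++row) {
                    double acc = 0.0;
                    for (int p = 0; p < k_array[i]; ++p)
                        acc += op_elem(a, lda_array[i], transa_array[i], row, p) *
                               op_elem(b, ldb_array[i], transb_array[i], p, col);
                    double *cij = &c[(size_t)col * ldc_array[i] + row];
                    /* When beta is zero, C is not read, so it need not be set on input. */
                    *cij = (beta_array[i] == 0.0)
                               ? alpha_array[i] * acc
                               : alpha_array[i] * acc + beta_array[i] * *cij;
                }
            }
        }
    }
}
```

Note that the per-group arrays (m_array, alpha_array, and so on) are indexed by the group number i, while the pointer arrays are indexed by the running matrix index idx.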
See also ?gemm for a detailed description of multiplication for general matrices, and ?gemm3m_batch, a BLAS-like extension routine, for similar matrix-matrix operations.
Error checking is not performed for oneMKL Windows* single dynamic libraries for the ?gemm_batch routines.

## Input Parameters

Layout
Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).
transa_array
Array of size group_count. For the group i, transa_i = transa_array[i] specifies the form of op(A) used in the matrix multiplication:
• if transa_i = CblasNoTrans, then op(A) = A;
• if transa_i = CblasTrans, then op(A) = A^T;
• if transa_i = CblasConjTrans, then op(A) = A^H.
transb_array
Array of size group_count. For the group i, transb_i = transb_array[i] specifies the form of op(B) used in the matrix multiplication:
• if transb_i = CblasNoTrans, then op(B) = B;
• if transb_i = CblasTrans, then op(B) = B^T;
• if transb_i = CblasConjTrans, then op(B) = B^H.
m_array
Array of size group_count. For the group i, m_i = m_array[i] specifies the number of rows of the matrix op(A) and of the matrix C. The value of each element of m_array must be at least zero.
n_array
Array of size group_count. For the group i, n_i = n_array[i] specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. The value of each element of n_array must be at least zero.
k_array
Array of size group_count. For the group i, k_i = k_array[i] specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of each element of k_array must be at least zero.
alpha_array
Array of size group_count. For the group i, alpha_array[i] specifies the scalar alpha_i.
a_array
Array of size total_batch_count, containing pointers to the arrays used to store the A matrices.
lda_array
Array of size group_count. For the group i, lda_i = lda_array[i] specifies the leading dimension of the array storing matrix A as declared in the calling (sub)program.

| | transa_i = CblasNoTrans | transa_i = CblasTrans or transa_i = CblasConjTrans |
|---|---|---|
| Layout = CblasColMajor | lda_i must be at least max(1, m_i). | lda_i must be at least max(1, k_i). |
| Layout = CblasRowMajor | lda_i must be at least max(1, k_i). | lda_i must be at least max(1, m_i). |

b_array
Array of size total_batch_count, containing pointers to the arrays used to store the B matrices.
ldb_array
Array of size group_count. For the group i, ldb_i = ldb_array[i] specifies the leading dimension of the array storing matrix B as declared in the calling (sub)program.

| | transb_i = CblasNoTrans | transb_i = CblasTrans or transb_i = CblasConjTrans |
|---|---|---|
| Layout = CblasColMajor | ldb_i must be at least max(1, k_i). | ldb_i must be at least max(1, n_i). |
| Layout = CblasRowMajor | ldb_i must be at least max(1, n_i). | ldb_i must be at least max(1, k_i). |

beta_array
Array of size group_count. For the group i, beta_array[i] specifies the scalar beta_i. When beta_i is equal to zero, the C matrices in group i need not be set on input.
c_array
Array of size total_batch_count, containing pointers to the arrays used to store the C matrices.
ldc_array
Array of size group_count. For the group i, ldc_i = ldc_array[i] specifies the leading dimension of all arrays storing matrix C in group i as declared in the calling (sub)program.
When Layout = CblasColMajor, ldc_i must be at least max(1, m_i).
When Layout = CblasRowMajor, ldc_i must be at least max(1, n_i).
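
The leading-dimension constraints for lda_array, ldb_array, and ldc_array all follow one rule: the leading dimension must cover the number of stored rows for column-major storage, or the number of stored columns for row-major storage, with a floor of 1. A minimal sketch of that rule in plain C follows. The names `ld_layout_t`, `ld_trans_t`, `min_ld_operand`, and `min_ldc` are hypothetical helpers for illustration only and are not part of oneMKL.

```c
#include <assert.h>

/* Hypothetical stand-ins for the CBLAS layout and transpose enums. */
typedef enum { LD_ROW_MAJOR, LD_COL_MAJOR } ld_layout_t;
typedef enum { LD_NO_TRANS, LD_TRANS, LD_CONJ_TRANS } ld_trans_t;

static int max_int(int a, int b) { return a > b ? a : b; }

/* Minimum leading dimension of a stored matrix X, where op(X) is rows-by-cols.
   For A pass (m, k); for B pass (k, n). */
int min_ld_operand(ld_layout_t layout, ld_trans_t trans, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    /* X itself is rows-by-cols when not transposed, cols-by-rows otherwise. */
    int stored_rows = (trans == LD_NO_TRANS) ? rows : cols;
    int stored_cols = (trans == LD_NO_TRANS) ? cols : rows;
    /* Column-major strides over stored rows; row-major over stored columns. */
    return max_int(1, layout == LD_COL_MAJOR ? stored_rows : stored_cols);
}

/* Minimum ldc: C is always m-by-n and is never transposed. */
int min_ldc(ld_layout_t layout, int m, int n)
{
    return min_ld_operand(layout, LD_NO_TRANS, m, n);
}
```

A caller could compare each lda_array[i], ldb_array[i], and ldc_array[i] against these minimums before invoking the batch routine, which may be useful in configurations where the library does not perform its own error checking.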
group_count
Specifies the number of groups. Must be at least 0.
group_size
Array of size group_count. The element group_size[i] specifies the number of matrices in group i. Each element in group_size must be at least 0.
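
Because a_array, b_array, and c_array are flat arrays spanning all groups, it can help to compute total_batch_count and the index of the first matrix in each group up front. The sketch below shows one way to do that in plain C. `batch_offsets` and its `group_start` output are hypothetical names introduced here for illustration, not part of the oneMKL API.

```c
#include <assert.h>

/* Returns total_batch_count, the sum of all group_size entries.
   If group_start is non-NULL, group_start[i] receives the index of the
   first matrix of group i within a_array, b_array, and c_array. */
int batch_offsets(int group_count, const int *group_size, int *group_start)
{
    assert(group_count >= 0);
    int total = 0;
    for (int i = 0; i < group_count; ++i) {
        assert(group_size[i] >= 0);
        if (group_start)
            group_start[i] = total;
        total += group_size[i];
    }
    return total;
}
```

For example, with group_size = {2, 0, 3} the pointer arrays must each hold 5 entries, and the three groups start at indices 0, 2, and 2.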

## Output Parameters

c_array
Output buffer, overwritten by total_batch_count matrix multiply operations of the form alpha*op(A)*op(B) + beta*C.
