Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

cblas_?gemm_batch_strided

Computes groups of matrix-matrix products with general matrices.

Syntax

void cblas_sgemm_batch_strided(const CBLAS_LAYOUT layout,
    const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb,
    const MKL_INT m, const MKL_INT n, const MKL_INT k,
    const float alpha,
    const float *a, const MKL_INT lda, const MKL_INT stridea,
    const float *b, const MKL_INT ldb, const MKL_INT strideb,
    const float beta,
    float *c, const MKL_INT ldc, const MKL_INT stridec,
    const MKL_INT batch_size);
void cblas_dgemm_batch_strided(const CBLAS_LAYOUT layout,
    const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb,
    const MKL_INT m, const MKL_INT n, const MKL_INT k,
    const double alpha,
    const double *a, const MKL_INT lda, const MKL_INT stridea,
    const double *b, const MKL_INT ldb, const MKL_INT strideb,
    const double beta,
    double *c, const MKL_INT ldc, const MKL_INT stridec,
    const MKL_INT batch_size);
void cblas_cgemm_batch_strided(const CBLAS_LAYOUT layout,
    const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb,
    const MKL_INT m, const MKL_INT n, const MKL_INT k,
    const void *alpha,
    const void *a, const MKL_INT lda, const MKL_INT stridea,
    const void *b, const MKL_INT ldb, const MKL_INT strideb,
    const void *beta,
    void *c, const MKL_INT ldc, const MKL_INT stridec,
    const MKL_INT batch_size);
void cblas_zgemm_batch_strided(const CBLAS_LAYOUT layout,
    const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb,
    const MKL_INT m, const MKL_INT n, const MKL_INT k,
    const void *alpha,
    const void *a, const MKL_INT lda, const MKL_INT stridea,
    const void *b, const MKL_INT ldb, const MKL_INT strideb,
    const void *beta,
    void *c, const MKL_INT ldc, const MKL_INT stridec,
    const MKL_INT batch_size);
Include Files
  • mkl.h
Description
The cblas_?gemm_batch_strided routines perform a series of matrix-matrix operations with general matrices. They are similar to their cblas_?gemm counterparts, but the cblas_?gemm_batch_strided routines operate on groups of matrices. The matrices in a group all have the same parameters.
All a matrices (respectively, b or c) have the same parameters (size, leading dimension, transpose operation, alpha and beta scaling) and are stored at a constant stride stridea (respectively, strideb or stridec) from each other. The operation is defined as
For i = 0 … batch_size - 1
    Ai, Bi and Ci are matrices at offset i * stridea, i * strideb and i * stridec in a, b and c
    Ci = alpha * op(Ai) * op(Bi) + beta * Ci
end for
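To make the batch loop concrete, the following is a minimal reference sketch in plain C for one case only: single precision, column-major layout, no transposition. It is not how Intel MKL implements the routine. The function name ref_sgemm_batch_strided is hypothetical, and int is used in place of MKL_INT so that the sketch compiles without mkl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative reference loop, NOT the MKL implementation.
 * Assumes column-major storage and op(A) = A, op(B) = B.
 * Matrix i of each batch starts at offset i * stride from its base pointer. */
void ref_sgemm_batch_strided(int m, int n, int k, float alpha,
                             const float *a, int lda, int stridea,
                             const float *b, int ldb, int strideb,
                             float beta,
                             float *c, int ldc, int stridec,
                             int batch_size)
{
    assert(m >= 0 && n >= 0 && k >= 0 && batch_size >= 0);
    for (int i = 0; i < batch_size; ++i) {
        const float *Ai = a + (size_t)i * stridea; /* m-by-k, leading dim lda */
        const float *Bi = b + (size_t)i * strideb; /* k-by-n, leading dim ldb */
        float *Ci = c + (size_t)i * stridec;       /* m-by-n, leading dim ldc */
        for (int col = 0; col < n; ++col) {
            for (int row = 0; row < m; ++row) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p)
                    acc += Ai[row + (size_t)p * lda] * Bi[p + (size_t)col * ldb];
                size_t idx = row + (size_t)col * ldc;
                Ci[idx] = alpha * acc + beta * Ci[idx];
            }
        }
    }
}
```

A loop of this kind can serve as a slow cross-check for results from the library routine on small inputs.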
Input Parameters
layout
Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).
transa
Specifies op(A), the transposition operation applied to the A matrices.
  • If transa = CblasNoTrans, then op(A) = A.
  • If transa = CblasTrans, then op(A) = A^T.
  • If transa = CblasConjTrans, then op(A) = A^H.
transb
Specifies op(B), the transposition operation applied to the B matrices.
  • If transb = CblasNoTrans, then op(B) = B.
  • If transb = CblasTrans, then op(B) = B^T.
  • If transb = CblasConjTrans, then op(B) = B^H.
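As an illustration of the three transposition options, the sketch below returns a single entry of op(A) for a complex matrix stored in column-major order. The enum trans_t and the function op_entry are hypothetical stand-ins introduced only for this example (real code would use CBLAS_TRANSPOSE from mkl.h), and C99 float complex is assumed as the element type.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical stand-in for CBLAS_TRANSPOSE, so the sketch needs no mkl.h. */
typedef enum { NO_TRANS, TRANS, CONJ_TRANS } trans_t;

/* Entry (r, p) of op(A), where A is stored column-major with leading
 * dimension lda. For real data, CONJ_TRANS behaves like TRANS. */
float complex op_entry(trans_t t, const float complex *A, size_t lda,
                       size_t r, size_t p)
{
    assert(lda >= 1);
    switch (t) {
    case NO_TRANS:
        return A[r + p * lda];        /* op(A) = A   */
    case TRANS:
        return A[p + r * lda];        /* op(A) = A^T */
    default:
        return conjf(A[p + r * lda]); /* op(A) = A^H */
    }
}
```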
m
Number of rows of the op(A) and C matrices. Must be at least 0.
n
Number of columns of the op(B) and C matrices. Must be at least 0.
k
Number of columns of the op(A) matrix and number of rows of the op(B) matrix. Must be at least 0.
alpha
Specifies the scalar alpha.
a
Array of size at least stridea * batch_size holding the a matrices. Before entry, the leading part of the array a + i * stridea must contain the matrix Ai. The size of that leading part depends on layout and transa:
  • layout = CblasColMajor, transa = CblasNoTrans: m-by-k
  • layout = CblasColMajor, transa = CblasTrans or CblasConjTrans: k-by-m
  • layout = CblasRowMajor, transa = CblasNoTrans: k-by-m
  • layout = CblasRowMajor, transa = CblasTrans or CblasConjTrans: m-by-k
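lda
Specifies the leading dimension of the a matrices. The minimum value depends on layout and transa:
  • layout = CblasColMajor, transa = CblasNoTrans: lda must be at least max(1, m).
  • layout = CblasColMajor, transa = CblasTrans or CblasConjTrans: lda must be at least max(1, k).
  • layout = CblasRowMajor, transa = CblasNoTrans: lda must be at least max(1, k).
  • layout = CblasRowMajor, transa = CblasTrans or CblasConjTrans: lda must be at least max(1, m).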
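stridea
Stride between two consecutive a matrices. The minimum value depends on layout and transa:
  • layout = CblasColMajor, transa = CblasNoTrans: must be at least lda * k.
  • layout = CblasColMajor, transa = CblasTrans or CblasConjTrans: must be at least lda * m.
  • layout = CblasRowMajor, transa = CblasNoTrans: must be at least lda * m.
  • layout = CblasRowMajor, transa = CblasTrans or CblasConjTrans: must be at least lda * k.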
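The lda and stridea minimums listed above follow one pattern: column-major without transposition and row-major with transposition behave alike, and so do the other two combinations. A minimal sketch of helpers that compute those minimums is shown below. The names layout_t, trans_t, min_lda and min_stridea are hypothetical and exist only in this example; real code would use CBLAS_LAYOUT, CBLAS_TRANSPOSE and MKL_INT from mkl.h.

```c
#include <assert.h>

/* Hypothetical stand-ins for CBLAS_LAYOUT / CBLAS_TRANSPOSE (no mkl.h needed). */
typedef enum { ROW_MAJOR, COL_MAJOR } layout_t;
typedef enum { NO_TRANS, TRANS, CONJ_TRANS } trans_t;

static long max1(long x) { return x > 1 ? x : 1; }

/* Smallest valid lda when op(A) is m-by-k. */
long min_lda(layout_t layout, trans_t transa, long m, long k)
{
    assert(m >= 0 && k >= 0);
    int notrans = (transa == NO_TRANS);
    /* ColMajor/NoTrans and RowMajor/Trans: m entries along the leading dimension. */
    if ((layout == COL_MAJOR) == notrans)
        return max1(m);
    return max1(k);
}

/* Smallest valid stridea for a given lda: lda times the other dimension. */
long min_stridea(layout_t layout, trans_t transa, long lda, long m, long k)
{
    assert(lda >= 1 && m >= 0 && k >= 0);
    int notrans = (transa == NO_TRANS);
    if ((layout == COL_MAJOR) == notrans)
        return lda * k;
    return lda * m;
}
```

The same pattern applies to ldb and strideb with (m, k) replaced by (k, n) and transa replaced by transb.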
b
Array of size at least strideb * batch_size holding the b matrices. Before entry, the leading part of the array b + i * strideb must contain the matrix Bi. The size of that leading part depends on layout and transb:
  • layout = CblasColMajor, transb = CblasNoTrans: k-by-n
  • layout = CblasColMajor, transb = CblasTrans or CblasConjTrans: n-by-k
  • layout = CblasRowMajor, transb = CblasNoTrans: n-by-k
  • layout = CblasRowMajor, transb = CblasTrans or CblasConjTrans: k-by-n
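ldb
Specifies the leading dimension of the b matrices. The minimum value depends on layout and transb:
  • layout = CblasColMajor, transb = CblasNoTrans: ldb must be at least max(1, k).
  • layout = CblasColMajor, transb = CblasTrans or CblasConjTrans: ldb must be at least max(1, n).
  • layout = CblasRowMajor, transb = CblasNoTrans: ldb must be at least max(1, n).
  • layout = CblasRowMajor, transb = CblasTrans or CblasConjTrans: ldb must be at least max(1, k).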
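strideb
Stride between two consecutive b matrices. The minimum value depends on layout and transb:
  • layout = CblasColMajor, transb = CblasNoTrans: must be at least ldb * n.
  • layout = CblasColMajor, transb = CblasTrans or CblasConjTrans: must be at least ldb * k.
  • layout = CblasRowMajor, transb = CblasNoTrans: must be at least ldb * k.
  • layout = CblasRowMajor, transb = CblasTrans or CblasConjTrans: must be at least ldb * n.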
beta
Specifies the scalar beta.
c
Array of size at least stridec * batch_size holding the c matrices.
  • If layout = CblasColMajor, before entry, the leading m-by-n part of the array c + i * stridec must contain the matrix Ci.
  • If layout = CblasRowMajor, before entry, the leading n-by-m part of the array c + i * stridec must contain the matrix Ci.
ldc
Specifies the leading dimension of the c matrices. Must be at least max(1, m) if layout = CblasColMajor or max(1, n) if layout = CblasRowMajor.
stridec
Specifies the stride between two consecutive c matrices. Must be at least ldc * n if layout = CblasColMajor or ldc * m if layout = CblasRowMajor.
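For readers who need to locate a single result entry, the sketch below computes the element offset of entry (row, col) of matrix Ci inside c, using the usual leading-dimension addressing for each layout. The function name c_offset and the col_major flag are hypothetical; they are not part of the MKL interface.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, in elements, of entry (row, col) of matrix C_i inside c.
 * col_major != 0 stands for CblasColMajor, 0 for CblasRowMajor. */
size_t c_offset(int col_major, size_t i, size_t row, size_t col,
                size_t ldc, size_t stridec)
{
    assert(ldc >= 1);
    /* Column-major steps by ldc per column, row-major by ldc per row. */
    size_t within = col_major ? row + col * ldc : row * ldc + col;
    return i * stridec + within;
}
```

For example, with ldc = 4 and stridec = 20 in column-major layout, entry (1, 3) of the third matrix (i = 2) sits at offset 2 * 20 + 1 + 3 * 4 = 53.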
batch_size
Number of gemm computations to perform, which is also the number of a, b and c matrices. Must be at least 0.
Output Parameters
c
Array holding the batch_size updated c matrices.
