cblas_?gemm_pack
Performs scaling and packing of the matrix into the previously allocated buffer.
Syntax
void cblas_sgemm_pack (const CBLAS_LAYOUT Layout, const CBLAS_IDENTIFIER identifier, const CBLAS_TRANSPOSE trans, const MKL_INT m, const MKL_INT n, const MKL_INT k, const float alpha, const float *src, const MKL_INT ld, float *dest);
void cblas_dgemm_pack (const CBLAS_LAYOUT Layout, const CBLAS_IDENTIFIER identifier, const CBLAS_TRANSPOSE trans, const MKL_INT m, const MKL_INT n, const MKL_INT k, const double alpha, const double *src, const MKL_INT ld, double *dest);
Include Files
- mkl.h
Description
The cblas_?gemm_pack routine is one of a set of related routines that enable use of an internal packed storage. Call cblas_?gemm_pack after you allocate a buffer whose size is given by cblas_?gemm_pack_get_size. The cblas_?gemm_pack routine scales the identified matrix by alpha and packs it into the previously allocated buffer. Do not copy the packed matrix to a different address, because the internal implementation depends on the alignment of internally stored metadata.
The cblas_?gemm_pack routine performs this operation:
dest := alpha*op(src)
as part of the computation
C := alpha*op(A)*op(B) + beta*C
where:
op(X) is one of the operations op(X) = X, op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
src is a matrix,
A, B, and C are matrices,
op(src) is an m-by-k matrix if identifier = CblasAMatrix,
op(src) is a k-by-n matrix if identifier = CblasBMatrix,
dest is an internal packed storage buffer.
You must use the same value of the Layout parameter for the entire sequence of related cblas_?gemm_pack and cblas_?gemm_compute calls. For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for packing B.
Input Parameters
- Layout
- Specifies whether two-dimensional array storage is row-major (CblasRowMajor) or column-major (CblasColMajor).
- identifier
- Specifies which matrix is to be packed:
  If identifier = CblasAMatrix, the routine packs matrix A.
  If identifier = CblasBMatrix, the routine packs matrix B.
- trans
- Specifies the form of op(src) used in the packing:
  If trans = CblasNoTrans, op(src) = src.
  If trans = CblasTrans, op(src) = src^T.
  If trans = CblasConjTrans, op(src) = src^H.
- m
- Specifies the number of rows of the matrix op(A) and of the matrix C. The value of m must be at least zero.
- n
- Specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. The value of n must be at least zero.
- k
- Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of k must be at least zero.
- alpha
- Specifies the scalar alpha.
- src
- Array. Its size and required contents depend on Layout, identifier, and trans:

| Layout | identifier | trans | Size | Before entry |
| --- | --- | --- | --- | --- |
| CblasColMajor | CblasAMatrix | CblasNoTrans | ld*k | The leading m-by-k part of the array src must contain the matrix A. |
| CblasColMajor | CblasAMatrix | CblasTrans or CblasConjTrans | ld*m | The leading k-by-m part of the array src must contain the matrix A. |
| CblasColMajor | CblasBMatrix | CblasNoTrans | ld*n | The leading k-by-n part of the array src must contain the matrix B. |
| CblasColMajor | CblasBMatrix | CblasTrans or CblasConjTrans | ld*k | The leading n-by-k part of the array src must contain the matrix B. |
| CblasRowMajor | CblasAMatrix | CblasNoTrans | ld*m | The leading k-by-m part of the array src must contain the matrix A. |
| CblasRowMajor | CblasAMatrix | CblasTrans or CblasConjTrans | ld*k | The leading m-by-k part of the array src must contain the matrix A. |
| CblasRowMajor | CblasBMatrix | CblasNoTrans | ld*k | The leading n-by-k part of the array src must contain the matrix B. |
| CblasRowMajor | CblasBMatrix | CblasTrans or CblasConjTrans | ld*n | The leading k-by-n part of the array src must contain the matrix B. |
- ld
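- Specifies the leading dimension of src as declared in the calling (sub)program. The minimum value depends on Layout, identifier, and trans:

| Layout | identifier | trans | Minimum ld |
| --- | --- | --- | --- |
| CblasColMajor | CblasAMatrix | CblasNoTrans | max(1, m) |
| CblasColMajor | CblasAMatrix | CblasTrans or CblasConjTrans | max(1, k) |
| CblasColMajor | CblasBMatrix | CblasNoTrans | max(1, k) |
| CblasColMajor | CblasBMatrix | CblasTrans or CblasConjTrans | max(1, n) |
| CblasRowMajor | CblasAMatrix | CblasNoTrans | max(1, k) |
| CblasRowMajor | CblasAMatrix | CblasTrans or CblasConjTrans | max(1, m) |
| CblasRowMajor | CblasBMatrix | CblasNoTrans | max(1, n) |
| CblasRowMajor | CblasBMatrix | CblasTrans or CblasConjTrans | max(1, k) |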
- dest
- Scaled and packed internal storage buffer.
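The size and ld requirements listed for src and ld above appear to follow one rule: take the shape of src as stored (op(src) is m-by-k for A and k-by-n for B, with the dimensions swapped when trans is not CblasNoTrans); then ld must cover the contiguous dimension of that shape and the array holds ld times the other dimension. The following is a minimal sketch of that rule, not part of the library. The `sk_` enums and functions are hypothetical stand-ins for CBLAS_LAYOUT, CBLAS_IDENTIFIER, and CBLAS_TRANSPOSE so that the sketch compiles without mkl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CBLAS enums; not the mkl.h definitions. */
typedef enum { SK_ROW_MAJOR, SK_COL_MAJOR } sk_layout;
typedef enum { SK_A_MATRIX, SK_B_MATRIX } sk_identifier;
typedef enum { SK_NO_TRANS, SK_TRANS, SK_CONJ_TRANS } sk_trans;

/* Rows and columns of src as stored, before op() is applied. */
static void sk_src_shape(sk_identifier id, sk_trans trans,
                         long m, long n, long k, long *rows, long *cols)
{
    /* op(src) is m-by-k for matrix A and k-by-n for matrix B. */
    long op_rows = (id == SK_A_MATRIX) ? m : k;
    long op_cols = (id == SK_A_MATRIX) ? k : n;
    if (trans == SK_NO_TRANS) {
        *rows = op_rows;
        *cols = op_cols;
    } else {
        /* Transposed or conjugate-transposed: stored shape is swapped. */
        *rows = op_cols;
        *cols = op_rows;
    }
}

/* Smallest ld allowed by the ld requirements above. */
static long sk_min_ld(sk_layout layout, sk_identifier id, sk_trans trans,
                      long m, long n, long k)
{
    long rows, cols;
    sk_src_shape(id, trans, m, n, k, &rows, &cols);
    /* Column-major strides over rows; row-major strides over columns. */
    long lead = (layout == SK_COL_MAJOR) ? rows : cols;
    return lead > 1 ? lead : 1;
}

/* Number of elements the src array must hold for a given ld. */
static size_t sk_src_elems(sk_layout layout, sk_identifier id, sk_trans trans,
                           long m, long n, long k, long ld)
{
    long rows, cols;
    sk_src_shape(id, trans, m, n, k, &rows, &cols);
    assert(ld >= sk_min_ld(layout, id, trans, m, n, k));
    long other = (layout == SK_COL_MAJOR) ? cols : rows;
    return (size_t)ld * (size_t)other;
}
```

For example, with m = 3, n = 4, k = 5, a column-major A matrix with no transposition needs ld of at least 3 and an array of ld*5 elements, which agrees with the requirements above.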
Output Parameters
- dest
- Overwritten by the matrix alpha*op(src).
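To make the value alpha*op(src) concrete, the sketch below computes it for a single-precision, column-major src and writes the result as an ordinary dense array. This illustrates the mathematical result only: the real dest buffer uses an internal packed format that this page does not describe, so the output here is not interchangeable with it. The function name `ref_scaled_op` and its parameters are hypothetical, and for real data the conjugate transpose is assumed to equal the plain transpose.

```c
#include <stddef.h>

/* Reference (unpacked) evaluation of alpha*op(src) for column-major data.
 * rows, cols : dimensions of op(src)
 * transposed : 0 for op(src) = src, nonzero for op(src) = src^T
 * ld         : leading dimension of src
 * out        : plain column-major rows-by-cols array, leading dimension rows
 */
static void ref_scaled_op(int transposed, size_t rows, size_t cols,
                          float alpha, const float *src, size_t ld,
                          float *out)
{
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            /* Column-major: element (r, c) of src lives at src[r + c*ld]. */
            float v = transposed ? src[j + i * ld] : src[i + j * ld];
            out[i + j * rows] = alpha * v;
        }
    }
}
```

With src holding the 2-by-3 matrix {1, 2, 3, 4, 5, 6} in column-major order (ld = 2) and alpha = 2, the non-transposed call fills out with {2, 4, 6, 8, 10, 12}, and the transposed call (rows = 3, cols = 2) fills it with {2, 6, 10, 4, 8, 12}.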