Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content
Contents

cblas_?gemmt

Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix.

Syntax

void cblas_sgemmt
(
const
CBLAS_LAYOUT
Layout
,
const
CBLAS_UPLO
uplo
,
const
CBLAS_TRANSPOSE
transa
,
const
CBLAS_TRANSPOSE
transb
,
const
MKL_INT
n
,
const
MKL_INT
k
,
const
float
alpha
,
const
float
*a
,
const
MKL_INT
lda
,
const
float
*b
,
const
MKL_INT
ldb
,
const
float
beta
,
float
*c
,
const
MKL_INT
ldc
);
void cblas_dgemmt
(
const
CBLAS_LAYOUT
Layout
,
const
CBLAS_UPLO
uplo
,
const
CBLAS_TRANSPOSE
transa
,
const
CBLAS_TRANSPOSE
transb
,
const
MKL_INT
n
,
const
MKL_INT
k
,
const
double
alpha
,
const
double
*a
,
const
MKL_INT
lda
,
const
double
*b
,
const
MKL_INT
ldb
,
const
double
beta
,
double
*c
,
const
MKL_INT
ldc
);
void cblas_cgemmt
(
const
CBLAS_LAYOUT
Layout
,
const
CBLAS_UPLO
uplo
,
const
CBLAS_TRANSPOSE
transa
,
const
CBLAS_TRANSPOSE
transb
,
const
MKL_INT
n
,
const
MKL_INT
k
,
const
void
*alpha
,
const
void
*a
,
const
MKL_INT
lda
,
const
void
*b
,
const
MKL_INT
ldb
,
const
void
*beta
,
void
*c
,
const
MKL_INT
ldc
);
void cblas_zgemmt
(
const
CBLAS_LAYOUT
Layout
,
const
CBLAS_UPLO
uplo
,
const
CBLAS_TRANSPOSE
transa
,
const
CBLAS_TRANSPOSE
transb
,
const
MKL_INT
n
,
const
MKL_INT
k
,
const
void
*alpha
,
const
void
*a
,
const
MKL_INT
lda
,
const
void
*b
,
const
MKL_INT
ldb
,
const
void
*beta
,
void
*c
,
const
MKL_INT
ldc
);
Include Files
  • mkl.h
Description
The
?gemmt
routines compute a scalar-matrix-matrix product with general matrices and add the result to the upper or lower part of a scalar-matrix product. These routines are similar to the
?gemm
routines, but they only access and update a triangular part of the square result matrix (see Application Notes below).
The operation is defined as
C
:=
alpha
*op(
A
)*op(
B
) +
beta
*
C
,
where:
op(
X
)
is one of
op(
X
) =
X
, or
op(
X
) =
X
T
, or
op(
X
) =
X
H
,
alpha
and
beta
are scalars,
A
,
B
and
C
are matrices:
op(
A
)
is an
n
-by-
k
matrix,
op(
B
)
is a
k
-by-
n
matrix,
C
is an
n
-by-
n
upper or lower triangular matrix.
Input Parameters
Layout
Specifies whether two-dimensional array storage is row-major (
CblasRowMajor
) or column-major (
CblasColMajor
).
uplo
Specifies whether the upper or lower triangular part of the array
c
is used. If
uplo
=
'CblasUpper'
, then the upper triangular part of the array
c
is used. If
uplo
=
'CblasLower'
, then the lower triangular part of the array
c
is used.
transa
Specifies the form of
op(
A
)
used in the matrix multiplication:
if
transa
=
'CblasNoTrans'
, then
op(
A
) =
A
;
if
transa
=
'CblasTrans'
, then
op(
A
) =
A
T
;
if
transa
=
'CblasConjTrans'
, then
op(
A
) =
A
H
.
transb
Specifies the form of
op(
B
)
used in the matrix multiplication:
if
transb
=
'CblasNoTrans'
, then
op(
B
) =
B
;
if
transb
=
'CblasTrans'
, then
op(
B
) =
B
T
;
if
transb
=
'CblasConjTrans'
, then
op(
B
) =
B
H
.
n
Specifies the order of the matrix
C
. The value of
n
must be at least zero.
k
Specifies the number of columns of the matrix
op(
A
)
and the number of rows of the matrix
op(
B
)
. The value of
k
must be at least zero.
alpha
Specifies the scalar
alpha
.
a
transa
=
'CblasNoTrans'
transa
=
'CblasTrans' or
'CblasConjTrans'
Layout
=
'CblasColMajor'
Array, size
lda
*
k
. Before entry, the leading
n
-by-
k
part of the array
a
must contain the matrix
A
.
Array, size
lda
*
n
. Before entry, the leading
k
-by-
n
part of the array
a
must contain the matrix
A
.
Layout
=
'CblasRowMajor'
Array, size
lda
*
n
. Before entry, the leading
k
-by-
n
part of the array
a
must contain the matrix
A
.
Array, size
lda
*
k
. Before entry, the leading
n
-by-
k
part of the array
a
must contain the matrix
A
.
lda
Specifies the leading dimension of
a
as declared in the calling (sub)program.
transa
=
'CblasNoTrans'
transa
=
'CblasTrans' or
'CblasConjTrans'
Layout
=
'CblasColMajor'
lda
must be at least max(1,
n
).
lda
must be at least max(1,
k
).
Layout
=
'CblasRowMajor'
lda
must be at least max(1,
k
).
lda
must be at least max(1,
n
).
b
transb
=
'CblasNoTrans'
transb
=
'CblasTrans' or
'CblasConjTrans'
Layout
=
'CblasColMajor'
Array, size
ldb
*
n
. Before entry, the leading
k
-by-
n
part of the array
b
must contain the matrix
B
.
Array, size
ldb
*
k
. Before entry, the leading
n
-by-
k
part of the array
b
must contain the matrix
B
.
Layout
=
'CblasRowMajor'
Array, size
ldb
*
k
. Before entry, the leading
n
-by-
k
part of the array
b
must contain the matrix
B
.
Array, size
ldb
*
n
. Before entry, the leading
k
-by-
n
part of the array
b
must contain the matrix
B
.
ldb
Specifies the leading dimension of
b
as declared in the calling (sub)program.
transb
=
'CblasNoTrans'
transb
=
'CblasTrans' or
'CblasConjTrans'
Layout
=
'CblasColMajor'
ldb
must be at least max(1,
k
).
ldb
must be at least max(1,
n
).
Layout
=
'CblasRowMajor'
ldb
must be at least max(1,
n
).
ldb
must be at least max(1,
k
).
beta
Specifies the scalar
beta
. When
beta
is equal to zero, then
c
need not be set on input.
c
Array, size
ldc
by
n
.
When
beta
is equal to zero,
c
need not be set on input.
uplo
=
'CblasUpper'
uplo
=
'CblasLower'
The leading
n
-by-
n
upper triangular part of the array
c
must contain the upper triangular part of the matrix
C
and the strictly lower triangular part of
c
is not referenced.
The leading
n
-by-
n
lower triangular part of the array
c
must contain the lower triangular part of the matrix
C
and the strictly upper triangular part of
c
is not referenced.
ldc
Specifies the leading dimension of
c
as declared in the calling (sub)program. The value of
ldc
must be at least max(1,
n
).
Output Parameters
c
When
uplo
=
'CblasUpper'
, the upper triangular part of the array
c
is overwritten by the upper triangular part of the updated matrix.
When
uplo
=
'CblasLower'
, the lower triangular part of the array
c
is overwritten by the lower triangular part of the updated matrix.
Application Notes
These routines only access and update the upper or lower triangular part of the result matrix. This can be useful when the result is known to be symmetric; for example, when computing a product of the form
C
:=
alpha
*
B
*
S
*
B
T
+
beta
*
C
, where
S
and
C
are symmetric matrices and
B
is a general matrix. In this case, first compute
A
:=
B
*
S
(which can be done using the corresponding
?symm
routine), then compute
C
:=
alpha
*
A
*
B
T
+
beta
*
C
using the
?gemmt
routine.

Product and Performance Information

1

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804