Developer Reference

p?gemm

Computes a scalar-matrix-matrix product and adds the result to a scalar-matrix product for distributed matrices.

Syntax

void psgemm (const char *transa, const char *transb,
             const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
             const float *alpha,
             const float *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
             const float *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
             const float *beta,
             float *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc);

void pdgemm (const char *transa, const char *transb,
             const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
             const double *alpha,
             const double *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
             const double *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
             const double *beta,
             double *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc);

void pcgemm (const char *transa, const char *transb,
             const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
             const MKL_Complex8 *alpha,
             const MKL_Complex8 *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
             const MKL_Complex8 *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
             const MKL_Complex8 *beta,
             MKL_Complex8 *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc);

void pzgemm (const char *transa, const char *transb,
             const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
             const MKL_Complex16 *alpha,
             const MKL_Complex16 *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca,
             const MKL_Complex16 *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb,
             const MKL_Complex16 *beta,
             MKL_Complex16 *c, const MKL_INT *ic, const MKL_INT *jc, const MKL_INT *descc);

Include Files
  • mkl_pblas.h
Description
The p?gemm routines perform a matrix-matrix operation with general distributed matrices. The operation is defined as
sub(C) := alpha*op(sub(A))*op(sub(B)) + beta*sub(C),
where:
op(x) is one of op(x) = x, op(x) = x', or, for the complex routines, op(x) = conjg(x'),
alpha and beta are scalars,
sub(A) = A(ia:ia+m-1, ja:ja+k-1), sub(B) = B(ib:ib+k-1, jb:jb+n-1), and sub(C) = C(ic:ic+m-1, jc:jc+n-1) are distributed matrices.
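The operation can be checked on a single process with a plain serial reference. The sketch below is an illustration only, not an MKL routine: ref_gemm_sub is a hypothetical name, the matrices are ordinary column-major arrays with leading dimensions lda, ldb, and ldc instead of descriptors, plain int stands in for MKL_INT, and only real double data is handled, so any transa or transb value other than 'N' or 'n' is treated as a transpose.

```c
#include <assert.h>
#include <stddef.h>

/* Entry (i, j) of a column-major matrix with leading dimension ld,
   addressed with 1-based indices as in the description above. */
static double elem(const double *x, int ld, int i, int j)
{
    return x[(size_t)(j - 1) * ld + (i - 1)];
}

/* Serial reference for
     sub(C) := alpha*op(sub(A))*op(sub(B)) + beta*sub(C)
   on undistributed column-major arrays. Hypothetical helper.
   No special-casing of alpha == 0 or beta == 0 is done here. */
void ref_gemm_sub(char transa, char transb, int m, int n, int k,
                  double alpha, const double *a, int ia, int ja, int lda,
                  const double *b, int ib, int jb, int ldb,
                  double beta, double *c, int ic, int jc, int ldc)
{
    int nota = (transa == 'N' || transa == 'n');
    int notb = (transb == 'N' || transb == 'n');

    assert(m >= 0 && n >= 0 && k >= 0);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int l = 0; l < k; ++l) {
                /* op(sub(A))(i,l): read A directly, or swap the roles
                   of row and column offsets when transposed. */
                double av = nota ? elem(a, lda, ia + i, ja + l)
                                 : elem(a, lda, ia + l, ja + i);
                /* op(sub(B))(l,j), same idea. */
                double bv = notb ? elem(b, ldb, ib + l, jb + j)
                                 : elem(b, ldb, ib + j, jb + l);
                sum += av * bv;
            }
            double *cij = &c[(size_t)(jc + j - 1) * ldc + (ic + i - 1)];
            *cij = alpha * sum + beta * (*cij);
        }
    }
}
```

A reference like this is mainly useful for validating a distributed call on small inputs: gather sub(C) from the process grid and compare it entry by entry with the serial result, within a rounding tolerance.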
Input Parameters
transa
(global) Specifies the form of op(sub(A)) used in the matrix multiplication:
if transa = 'N' or 'n', then op(sub(A)) = sub(A);
if transa = 'T' or 't', then op(sub(A)) = sub(A)';
if transa = 'C' or 'c', then op(sub(A)) = conjg(sub(A)'), the conjugate transpose. For the real routines psgemm and pdgemm this is the same as 'T'.
transb
(global) Specifies the form of op(sub(B)) used in the matrix multiplication:
if transb = 'N' or 'n', then op(sub(B)) = sub(B);
if transb = 'T' or 't', then op(sub(B)) = sub(B)';
if transb = 'C' or 'c', then op(sub(B)) = conjg(sub(B)'), the conjugate transpose. For the real routines psgemm and pdgemm this is the same as 'T'.
m
(global) Specifies the number of rows of the distributed matrices op(sub(A)) and sub(C), m ≥ 0.
n
(global) Specifies the number of columns of the distributed matrices op(sub(B)) and sub(C), n ≥ 0.
k
(global) Specifies the number of columns of the distributed matrix op(sub(A)) and the number of rows of the distributed matrix op(sub(B)), k ≥ 0.
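Because m, n, and k describe op(sub(A)) and op(sub(B)) rather than the stored submatrices, the stored shapes swap when a transpose is requested. The sketch below spells this out and adds a simple bounds check against the global matrix size. All names (shape, sub_a_shape, sub_b_shape, sub_fits) are hypothetical helpers for illustration, and plain int stands in for MKL_INT.

```c
#include <assert.h>

typedef struct { int rows, cols; } shape;

static int is_notrans(char t) { return t == 'N' || t == 'n'; }

/* Stored shape of sub(A): m-by-k for 'N', k-by-m for 'T' or 'C'. */
shape sub_a_shape(char transa, int m, int k)
{
    assert(m >= 0 && k >= 0);
    shape s = { is_notrans(transa) ? m : k, is_notrans(transa) ? k : m };
    return s;
}

/* Stored shape of sub(B): k-by-n for 'N', n-by-k for 'T' or 'C'. */
shape sub_b_shape(char transb, int n, int k)
{
    assert(n >= 0 && k >= 0);
    shape s = { is_notrans(transb) ? k : n, is_notrans(transb) ? n : k };
    return s;
}

/* Nonzero when the submatrix starting at 1-based (i0, j0) with shape s
   lies inside a global matrix of grows-by-gcols entries. */
int sub_fits(int i0, int j0, shape s, int grows, int gcols)
{
    return i0 >= 1 && j0 >= 1 &&
           i0 + s.rows - 1 <= grows &&
           j0 + s.cols - 1 <= gcols;
}
```

For example, with transa = 'T', m = 4, and k = 3, the stored sub(A) is 3-by-4, so ia and ja must leave room for 3 rows and 4 columns of the global matrix A.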
alpha
(global) Specifies the scalar alpha.
When alpha is equal to zero, the local entries of the arrays a and b corresponding to the entries of the submatrices sub(A) and sub(B), respectively, need not be set on input.
a
(local) Array, size lld_a by kla, where kla is LOCc(ja+k-1) when transa = 'N' or 'n', and is LOCq(ja+m-1) otherwise. Before entry this array must contain the local pieces of the distributed matrix sub(A).
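The local column count that sizes this array can be computed per process. The sketch below assumes that LOCc and LOCq both denote the number of columns, out of the first n global ones, that a block-cyclic distribution places on the calling process column, which is the quantity the ScaLAPACK tool routine NUMROC returns. The function name local_count is hypothetical and the arithmetic mirrors NUMROC's counting logic; plain int stands in for MKL_INT.

```c
#include <assert.h>

/* Number of rows or columns, out of the first n global ones, owned by
   process iproc when blocks of size nb are dealt out cyclically over
   nprocs processes, starting at process isrc. Hypothetical helper that
   follows the same counting logic as ScaLAPACK's NUMROC. */
int local_count(int n, int nb, int iproc, int isrc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);

    int mydist    = (nprocs + iproc - isrc) % nprocs; /* distance from source process */
    int nblocks   = n / nb;                           /* number of whole blocks       */
    int count     = (nblocks / nprocs) * nb;          /* complete rounds of the deal  */
    int extrablks = nblocks % nprocs;                 /* whole blocks left over       */

    if (mydist < extrablks)
        count += nb;        /* this process gets one more whole block   */
    else if (mydist == extrablks)
        count += n % nb;    /* this process gets the trailing partial block */
    return count;
}
```

Under these assumptions, kla for transa = 'N' would be local_count(ja + k - 1, nb_a, mycol, csrc_a, npcol), where nb_a and csrc_a come from the descriptor of A and mycol and npcol describe the process grid.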
ia, ja
(global) The row and column indices in the distributed matrix A indicating the first row and the first column of the submatrix sub(A), respectively.
desca
(global and local) Array of dimension 9. The array descriptor of the distributed matrix A.
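The nine descriptor entries follow the ScaLAPACK convention for a dense block-cyclic matrix. The sketch below fills one by hand to show what each slot holds. It is an illustration under assumptions: fill_desc is a hypothetical helper, plain int stands in for MKL_INT (which may be a 64-bit type when the ILP64 interface is used), and production code would normally call the ScaLAPACK routine descinit instead, which also validates its arguments.

```c
#include <assert.h>

/* 0-based C positions of the descriptor entries. */
enum { DTYPE_ = 0, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_, DLEN_ };

/* Hypothetical helper: populate a 9-entry array descriptor.
   m, n       global matrix size
   mb, nb     row and column block sizes
   rsrc, csrc process row and column holding the first block
   ctxt       BLACS context handle of the process grid
   lld        leading dimension of the local array */
void fill_desc(int desc[DLEN_], int m, int n, int mb, int nb,
               int rsrc, int csrc, int ctxt, int lld)
{
    assert(m >= 0 && n >= 0 && mb > 0 && nb > 0 && lld >= 1);

    desc[DTYPE_] = 1;     /* 1 marks a dense block-cyclic matrix */
    desc[CTXT_]  = ctxt;
    desc[M_]     = m;
    desc[N_]     = n;
    desc[MB_]    = mb;
    desc[NB_]    = nb;
    desc[RSRC_]  = rsrc;
    desc[CSRC_]  = csrc;
    desc[LLD_]   = lld;
}
```

The last entry is the lld_a value referred to in the size of the array a. It is a local quantity and may differ between processes, which is why the descriptor is described as both global and local.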
b
(local) Array, size lld_b by klb, where klb is LOCc(jb+n-1) when transb = 'N' or 'n', and is LOCq(jb+k-1) otherwise. Before entry this array must contain the local pieces of the distributed matrix sub(B).
ib, jb
(global) The row and column indices in the distributed matrix B indicating the first row and the first column of the submatrix sub(B), respectively.
descb
(global and local) Array of dimension 9. The array descriptor of the distributed matrix B.
beta
(global) Specifies the scalar beta.
When beta is equal to zero, sub(C) need not be set on input.
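The statement that sub(C) need not be set when beta is zero has a practical consequence: an implementation cannot simply multiply by beta, because 0 * NaN is NaN and uninitialized memory may hold anything. The sketch below shows one common way to honor the rule on a local column-major block, by assigning zero instead of multiplying. It is a hypothetical helper (scale_block), not a description of MKL internals, which the reference does not document.

```c
#include <assert.h>
#include <stddef.h>

/* Apply the beta*sub(C) term in place on an m-by-n column-major block
   with leading dimension ldc. When beta is zero, entries are overwritten
   with 0.0 and their old values are never read, so NaN or uninitialized
   contents cannot propagate into the result. */
void scale_block(int m, int n, double beta, double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && ldc >= (m > 1 ? m : 1));

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double *cij = &c[(size_t)j * ldc + i];
            *cij = (beta == 0.0) ? 0.0 : beta * (*cij);
        }
    }
}
```

After this step the product term alpha*op(sub(A))*op(sub(B)) can be accumulated into the block without further reference to beta.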
c
(local) Array, size (lld_c, LOCq(jc+n-1)). Before entry this array must contain the local pieces of the distributed matrix sub(C).
ic, jc
(global) The row and column indices in the distributed matrix C indicating the first row and the first column of the submatrix sub(C), respectively.
descc
(global and local) Array of dimension 9. The array descriptor of the distributed matrix C.
Output Parameters
c
Overwritten by the m-by-n distributed matrix alpha*op(sub(A))*op(sub(B)) + beta*sub(C).
