Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

p?trmm

Computes a scalar-matrix-matrix product (one matrix operand is triangular) for distributed matrices.

Syntax

void pstrmm (const char *side, const char *uplo, const char *transa, const char *diag, const MKL_INT *m, const MKL_INT *n, const float *alpha, const float *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca, float *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb);
void pdtrmm (const char *side, const char *uplo, const char *transa, const char *diag, const MKL_INT *m, const MKL_INT *n, const double *alpha, const double *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca, double *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb);
void pctrmm (const char *side, const char *uplo, const char *transa, const char *diag, const MKL_INT *m, const MKL_INT *n, const MKL_Complex8 *alpha, const MKL_Complex8 *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca, MKL_Complex8 *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb);
void pztrmm (const char *side, const char *uplo, const char *transa, const char *diag, const MKL_INT *m, const MKL_INT *n, const MKL_Complex16 *alpha, const MKL_Complex16 *a, const MKL_INT *ia, const MKL_INT *ja, const MKL_INT *desca, MKL_Complex16 *b, const MKL_INT *ib, const MKL_INT *jb, const MKL_INT *descb);
Include Files
  • mkl_pblas.h
Description
The p?trmm routines perform a matrix-matrix operation in which one operand is a triangular distributed matrix. The operation is defined as
sub(B) := alpha*op(sub(A))*sub(B)
or
sub(B) := alpha*sub(B)*op(sub(A))
where:
alpha is a scalar,
sub(B) is an m-by-n distributed matrix, sub(B) = B(ib:ib+m-1, jb:jb+n-1).
A is a unit or non-unit, upper or lower triangular distributed matrix,
sub(A) = A(ia:ia+m-1, ja:ja+m-1), if side = 'L' or 'l', and
sub(A) = A(ia:ia+n-1, ja:ja+n-1), if side = 'R' or 'r'.
op(sub(A)) is one of
op(sub(A)) = sub(A), or
op(sub(A)) = sub(A)', or
op(sub(A)) = conjg(sub(A)').
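To make the operation concrete, the following is a minimal serial sketch of the same update on an ordinary (non-distributed) column-major matrix. It is not the MKL implementation and does not call p?trmm. The function name ref_dtrmm and the leading-dimension arguments lda and ldb are hypothetical, introduced only for this illustration. It assumes real double-precision data, so 'T' and 'C' yield the same op(A).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical serial reference, NOT the MKL implementation.
   Computes B := alpha*op(A)*B (side 'L') or B := alpha*B*op(A) (side 'R')
   for column-major real matrices. A has order k = m (left) or n (right).
   Returns 0 on success, -1 if temporary storage cannot be allocated. */
int ref_dtrmm(char side, char uplo, char transa, char diag,
              int m, int n, double alpha,
              const double *a, int lda, double *b, int ldb)
{
    const int left  = toupper((unsigned char)side)   == 'L';
    const int upper = toupper((unsigned char)uplo)   == 'U';
    const int trans = toupper((unsigned char)transa) != 'N';
    const int unit  = toupper((unsigned char)diag)   == 'U';
    const int k = left ? m : n;

    assert(m >= 0 && n >= 0);
    if (m == 0 || n == 0) return 0;               /* nothing to update */

    double *t = calloc((size_t)k * (size_t)k, sizeof *t);  /* dense op(A) */
    double *c = calloc((size_t)m * (size_t)n, sizeof *c);  /* result      */
    if (!t || !c) { free(t); free(c); return -1; }

    /* Expand op(A). Only the triangle selected by uplo is read, and the
       diagonal is taken as 1 without being read when diag = 'U'. */
    for (int j = 0; j < k; ++j) {
        for (int i = 0; i < k; ++i) {
            double v = 0.0;
            if (i == j)
                v = unit ? 1.0 : a[i + (size_t)j * lda];
            else if (upper ? (i < j) : (i > j))
                v = a[i + (size_t)j * lda];
            if (trans) t[j + (size_t)i * k] = v;
            else       t[i + (size_t)j * k] = v;
        }
    }

    /* Plain triple loop: c = alpha * (op(A)*B) or alpha * (B*op(A)). */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double s = 0.0;
            if (left)
                for (int p = 0; p < m; ++p)
                    s += t[i + (size_t)p * k] * b[p + (size_t)j * ldb];
            else
                for (int p = 0; p < n; ++p)
                    s += b[i + (size_t)p * ldb] * t[p + (size_t)j * k];
            c[i + (size_t)j * m] = alpha * s;
        }
    }

    /* Overwrite B with the result, as the routine description states. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            b[i + (size_t)j * ldb] = c[i + (size_t)j * m];

    free(t);
    free(c);
    return 0;
}
```

The sketch expands op(A) into a dense temporary for readability. A production kernel would work in place on the triangle and, in the distributed case, on block-cyclic local pieces.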
Input Parameters
side
(global) Specifies whether op(sub(A)) appears on the left or right of sub(B) in the operation:
if side = 'L' or 'l', then sub(B) := alpha*op(sub(A))*sub(B);
if side = 'R' or 'r', then sub(B) := alpha*sub(B)*op(sub(A)).
uplo
(global) Specifies whether the distributed matrix sub(A) is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
transa
(global) Specifies the form of op(sub(A)) used in the matrix multiplication:
if transa = 'N' or 'n', then op(sub(A)) = sub(A);
if transa = 'T' or 't', then op(sub(A)) = sub(A)';
if transa = 'C' or 'c', then op(sub(A)) = conjg(sub(A)').
diag
(global) Specifies whether the matrix sub(A) is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
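The accepted values of the four character arguments can be checked before a call. The sketch below is a hypothetical helper (check_trmm_flags is not an MKL function) that accepts exactly the upper- and lower-case letters listed above and reports the position of the first invalid argument.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns nonzero if c (case-insensitive) is one of the letters in set.
   The explicit c != 0 test is needed because strchr also matches the
   terminating null character of set. */
static int one_of(char c, const char *set)
{
    int u = toupper((unsigned char)c);
    return u != 0 && strchr(set, u) != NULL;
}

/* Hypothetical helper, not part of MKL.
   Returns 0 if all four flags hold a documented value, otherwise the
   1-based position (1 = side, 2 = uplo, 3 = transa, 4 = diag) of the
   first invalid one. */
int check_trmm_flags(const char *side, const char *uplo,
                     const char *transa, const char *diag)
{
    assert(side && uplo && transa && diag);
    if (!one_of(*side,   "LR"))  return 1;
    if (!one_of(*uplo,   "UL"))  return 2;
    if (!one_of(*transa, "NTC")) return 3;
    if (!one_of(*diag,   "UN"))  return 4;
    return 0;
}
```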
m
(global) Specifies the number of rows of the distributed matrix sub(B), m ≥ 0.
n
(global) Specifies the number of columns of the distributed matrix sub(B), n ≥ 0.
alpha
(global)
Specifies the scalar alpha.
When alpha is zero, the array b need not be set before entry.
a
(local)
Array, size lld_a by ka, where ka is at least LOCq(1, ja+m-1) when side = 'L' or 'l' and at least LOCq(1, ja+n-1) when side = 'R' or 'r'.
Before entry with uplo = 'U' or 'u', this array contains the local entries corresponding to the entries of the upper triangular distributed matrix sub(A), and the local entries corresponding to the strictly lower triangular part of sub(A) are not referenced.
Before entry with uplo = 'L' or 'l', this array contains the local entries corresponding to the entries of the lower triangular distributed matrix sub(A), and the local entries corresponding to the strictly upper triangular part of sub(A) are not referenced.
When diag = 'U' or 'u', the local entries corresponding to the diagonal elements of the submatrix sub(A) are not referenced either, but are assumed to be unity.
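The bound on ka can be estimated with the usual block-cyclic counting rule. The sketch below assumes that LOCq(1, K) means the number of columns a given process column owns when the first K global columns are distributed block-cyclically, which is what ScaLAPACK's NUMROC computes. The function names local_count and min_ka, and the arguments nb_a (column block size), mycol, csrc_a, and npcol, are hypothetical names for this illustration, and plain int stands in for MKL_INT.

```c
#include <assert.h>
#include <ctype.h>

/* Number of rows or columns owned by process iproc when n items are
   distributed in blocks of nb over nprocs processes, starting at
   process isrcproc. Mirrors the logic of ScaLAPACK's NUMROC. */
int local_count(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    int mydist    = (nprocs + iproc - isrcproc) % nprocs; /* distance from source */
    int nblocks   = n / nb;                               /* whole blocks         */
    int count     = (nblocks / nprocs) * nb;              /* full rounds          */
    int extrablks = nblocks % nprocs;                     /* leftover blocks      */

    if (mydist < extrablks)       count += nb;      /* gets one extra full block */
    else if (mydist == extrablks) count += n % nb;  /* gets the partial block    */
    return count;
}

/* Hypothetical helper: smallest ka allowed for the local array a,
   under the assumption stated in the lead-in. */
int min_ka(char side, int m, int n, int ja,
           int nb_a, int mycol, int csrc_a, int npcol)
{
    int k = (toupper((unsigned char)side) == 'L') ? m : n; /* order of sub(A) */
    return local_count(ja + k - 1, nb_a, mycol, csrc_a, npcol);
}
```

For example, with 10 columns in blocks of 2 over 2 process columns, local_count gives 6 for process column 0 and 4 for process column 1.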
ia, ja
(global) The row and column indices in the distributed matrix A indicating the first row and the first column of the submatrix sub(A), respectively.
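The global indices ia and ja refer to positions in the whole distributed matrix, not in any local array. As an illustration, the sketch below maps a global index to its owning process and local index under a block-cyclic distribution. It assumes 1-based global indices (consistent with the A(ia:ia+m-1, ...) notation above) and follows the formulas of ScaLAPACK's INDXG2P and INDXG2L tools. The names owner_of and local_index are hypothetical.

```c
#include <assert.h>

/* Process coordinate that owns 1-based global index ig, for block size nb,
   nprocs processes, and the first block stored on process isrcproc. */
int owner_of(int ig, int nb, int isrcproc, int nprocs)
{
    assert(ig >= 1 && nb > 0 && nprocs > 0);
    return (isrcproc + (ig - 1) / nb) % nprocs;
}

/* 1-based local index of global index ig on the process that owns it. */
int local_index(int ig, int nb, int nprocs)
{
    assert(ig >= 1 && nb > 0 && nprocs > 0);
    return nb * ((ig - 1) / (nb * nprocs)) + (ig - 1) % nb + 1;
}
```

With nb = 2 and 2 processes, process 0 holds global indices 1, 2, 5, 6, so global index 5 is its third local entry.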
desca
(global and local) Array of dimension 9. The array descriptor of the distributed matrix A.
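For orientation, the sketch below fills a 9-element descriptor by hand, assuming the standard ScaLAPACK layout for a dense block-cyclic matrix (descriptor type 1). The enum names and the function fill_desc are hypothetical, and plain int stands in for MKL_INT. In real code the descriptor is normally initialized with the ScaLAPACK routine descinit, which also validates its arguments.

```c
#include <assert.h>

/* Positions within the descriptor array (0-based C indexing).
   Names are illustrative; the order follows the ScaLAPACK convention. */
enum {
    DESC_DTYPE = 0, /* descriptor type, 1 for dense block-cyclic         */
    DESC_CTXT,      /* BLACS context handle                              */
    DESC_M,         /* global number of rows                             */
    DESC_N,         /* global number of columns                          */
    DESC_MB,        /* row block size                                    */
    DESC_NB,        /* column block size                                 */
    DESC_RSRC,      /* process row holding the first matrix row          */
    DESC_CSRC,      /* process column holding the first matrix column    */
    DESC_LLD,       /* local leading dimension                           */
    DESC_LEN        /* number of entries, 9                              */
};

/* Hypothetical helper: stores the fields without any validation. */
void fill_desc(int desc[DESC_LEN], int ictxt, int m, int n,
               int mb, int nb, int rsrc, int csrc, int lld)
{
    assert(desc != 0 && lld >= 1);
    desc[DESC_DTYPE] = 1;
    desc[DESC_CTXT]  = ictxt;
    desc[DESC_M]     = m;
    desc[DESC_N]     = n;
    desc[DESC_MB]    = mb;
    desc[DESC_NB]    = nb;
    desc[DESC_RSRC]  = rsrc;
    desc[DESC_CSRC]  = csrc;
    desc[DESC_LLD]   = lld;
}
```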
b
(local)
Array, size (lld_b, LOCq(1, jb+n-1)).
Before entry, this array contains the local pieces of the distributed matrix sub(B).
ib, jb
(global) The row and column indices in the distributed matrix B indicating the first row and the first column of the submatrix sub(B), respectively.
descb
(global and local) Array of dimension 9. The array descriptor of the distributed matrix B.
Output Parameters
b
Overwritten by the transformed distributed matrix.
