Developer Reference

  • 0.9
  • 09/09/2020
  • Public Content

mkl_graph_mxm

Computes a (masked) graph matrix-matrix product.

Syntax

mkl_graph_status_t mkl_graph_mxm (mkl_graph_matrix_t C, mkl_graph_matrix_t M, mkl_graph_accumulator_t accum, mkl_graph_semiring_t semiring, mkl_graph_matrix_t A, mkl_graph_matrix_t B, mkl_graph_descriptor_t desc, mkl_graph_request_t request, mkl_graph_method_t method);
Include Files
  • mkl_graph.h
Description
The mkl_graph_mxm routine computes a (masked) graph matrix-matrix product defined as
C<M> := accum(C, opA(A)*opB(B))
where A and B are the input matrices, C is the output matrix, accum is an optional binary operator used as an accumulator for the output matrix, and M is an optional mask for the output matrix. Modifications of the operation, including the matrix modifiers opA and opB, are defined through the operation descriptor desc.
The routine supports both single-stage and multistage execution modes via the request parameter. See Graph Operations for a description of the two modes. You can select the method used for the computation via the method parameter. Use MKL_GRAPH_METHOD_AUTO for an automatic choice.
The operation can be done in-place, meaning that the output matrix can be aliased with any of the input matrices or the mask. In this case the original data of the aliased matrix is replaced by the result.
In the masked case, by default, the pattern of the output matrix (the set of entries present) is the intersection of the pattern of the mask and the pattern of the non-masked product of matrices A and B. If MKL_GRAPH_MODIFIER_OUTPUT is set to MKL_GRAPH_KEEP_MASK_STRUCTURE in the descriptor desc, the pattern of the output matrix is exactly the pattern of the mask. Entries that are present in the mask but missing in the non-masked product are given the value zero in the output data type.
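The two output-pattern rules above can be illustrated with a small stand-alone sketch. It does not call the library and is not how the routine is implemented: masked_mxm_dot and sparse_dot are hypothetical helper names, the CSR/CSC arrays are plain C arrays assumed to hold sorted indices, and the semiring is fixed to plus-times on float.

```c
#include <assert.h>

/* Hypothetical helper, not part of the library: plus-times dot product of
   two sparse vectors given as sorted index lists. Returns 1 and stores the
   sum in *out if the index lists overlap, otherwise returns 0. */
static int sparse_dot(const int *a_idx, const float *a_val, int a_len,
                      const int *b_idx, const float *b_val, int b_len,
                      float *out)
{
    int p = 0, q = 0, hit = 0;
    float sum = 0.0f;
    while (p < a_len && q < b_len) {
        if (a_idx[p] < b_idx[q]) {
            ++p;
        } else if (a_idx[p] > b_idx[q]) {
            ++q;
        } else {
            sum += a_val[p] * b_val[q];
            hit = 1;
            ++p;
            ++q;
        }
    }
    *out = sum;
    return hit;
}

/* Sketch of C<M> = A*B with A in CSR, B in CSC, and M and C in CSR.
   Only the pattern of M is read. c_ptr needs nrows+1 slots; c_idx and
   c_val need room for nnz(M) entries, the upper bound on nnz(C).
   keep_mask_structure == 0: pattern(C) = pattern(M) intersected with
                             pattern(A*B).
   keep_mask_structure != 0: pattern(C) = pattern(M); entries missing in
                             A*B are stored as 0.
   Returns the number of entries written to C. */
static int masked_mxm_dot(int nrows,
                          const int *a_ptr, const int *a_idx, const float *a_val,
                          const int *b_ptr, const int *b_idx, const float *b_val,
                          const int *m_ptr, const int *m_idx,
                          int keep_mask_structure,
                          int *c_ptr, int *c_idx, float *c_val)
{
    assert(nrows >= 0);
    int nnz = 0;
    for (int i = 0; i < nrows; ++i) {
        c_ptr[i] = nnz;
        for (int k = m_ptr[i]; k < m_ptr[i + 1]; ++k) {
            int j = m_idx[k];   /* mask allows entry (i, j) */
            float v;
            int hit = sparse_dot(a_idx + a_ptr[i], a_val + a_ptr[i],
                                 a_ptr[i + 1] - a_ptr[i],
                                 b_idx + b_ptr[j], b_val + b_ptr[j],
                                 b_ptr[j + 1] - b_ptr[j], &v);
            if (hit || keep_mask_structure) {
                c_idx[nnz] = j;
                c_val[nnz] = hit ? v : 0.0f;
                ++nnz;
            }
        }
    }
    c_ptr[nrows] = nnz;
    return nnz;
}
```

For A = [[1, 2], [0, 3]], B = [[4, 0], [0, 5]] and a mask with entries (0,1), (1,0), (1,1), the default rule yields two entries, (0,1) = 10 and (1,1) = 15, while the keep-mask-structure rule yields three, with (1,0) stored as 0.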
For user control of output storage allocation, use multistage execution.
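The general idea of splitting a sparse product into a counting stage and a filling stage, so that the caller allocates the output arrays in between, can be sketched as follows. This is a generic two-phase, Gustavson-style CSR times CSR product written for illustration only. The function names spgemm_count_nnz and spgemm_fill_entries are hypothetical, and the exact division of work between the library's own stages is described in Graph Operations, not here.

```c
#include <assert.h>
#include <stdlib.h>

/* Phase 1 (symbolic): fill c_ptr[0..nrows] with the row pointers of
   C = A*B, where A and B are CSR and B has ncols_b columns.
   Returns nnz(C), or -1 if the scratch array cannot be allocated. */
static int spgemm_count_nnz(int nrows, int ncols_b,
                            const int *a_ptr, const int *a_idx,
                            const int *b_ptr, const int *b_idx,
                            int *c_ptr)
{
    assert(nrows >= 0 && ncols_b > 0);
    int *marker = malloc((size_t)ncols_b * sizeof *marker);
    if (!marker) return -1;
    for (int j = 0; j < ncols_b; ++j) marker[j] = -1;
    int nnz = 0;
    for (int i = 0; i < nrows; ++i) {
        c_ptr[i] = nnz;
        for (int p = a_ptr[i]; p < a_ptr[i + 1]; ++p) {
            int k = a_idx[p];
            for (int q = b_ptr[k]; q < b_ptr[k + 1]; ++q) {
                int j = b_idx[q];
                if (marker[j] != i) {   /* column j not yet seen in row i */
                    marker[j] = i;
                    ++nnz;
                }
            }
        }
    }
    c_ptr[nrows] = nnz;
    free(marker);
    return nnz;
}

/* Phase 2 (numeric): fill c_idx and c_val, which the caller allocated with
   c_ptr[nrows] entries each. Plus-times semiring on float. Column indices
   within a row appear in order of discovery, not sorted.
   Returns 0 on success, -1 if the scratch array cannot be allocated. */
static int spgemm_fill_entries(int nrows, int ncols_b,
                               const int *a_ptr, const int *a_idx, const float *a_val,
                               const int *b_ptr, const int *b_idx, const float *b_val,
                               const int *c_ptr, int *c_idx, float *c_val)
{
    assert(nrows >= 0 && ncols_b > 0);
    /* pos[j] is the output slot of column j; it belongs to the current
       row i only if pos[j] >= c_ptr[i]. */
    int *pos = malloc((size_t)ncols_b * sizeof *pos);
    if (!pos) return -1;
    for (int j = 0; j < ncols_b; ++j) pos[j] = -1;
    for (int i = 0; i < nrows; ++i) {
        int next = c_ptr[i];
        for (int p = a_ptr[i]; p < a_ptr[i + 1]; ++p) {
            int k = a_idx[p];
            float a = a_val[p];
            for (int q = b_ptr[k]; q < b_ptr[k + 1]; ++q) {
                int j = b_idx[q];
                if (pos[j] < c_ptr[i]) {        /* first hit in this row */
                    pos[j] = next;
                    c_idx[next] = j;
                    c_val[next] = a * b_val[q];
                    ++next;
                } else {                        /* accumulate */
                    c_val[pos[j]] += a * b_val[q];
                }
            }
        }
        assert(next == c_ptr[i + 1]);           /* must agree with phase 1 */
    }
    free(pos);
    return 0;
}
```

A caller would run the first function, allocate c_idx and c_val with the returned number of entries using whatever allocator it prefers, and then run the second function. That allocation step in between is the point where the user controls the output storage.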
For maximum performance, the following configurations are recommended:
  • For masked multiplication with a sparse mask, multiply CSR times CSC using MKL_GRAPH_METHOD_DOT or MKL_GRAPH_METHOD_AUTO. Use CSR format for the mask and for the output of multistage execution.
  • For non-masked multiplication, or when the mask is dense, multiply matching formats (CSR times CSR or CSC times CSC). Use the same format for the mask and for multistage output, and use MKL_GRAPH_METHOD_GUSTAVSON or MKL_GRAPH_METHOD_AUTO.
A “dense” mask means the mask has all possible entries present. It must still be stored in CSR or CSC format using mkl_graph_matrix_set_csr or mkl_graph_matrix_set_csc. For the purpose of these recommendations, a transpose flag in the descriptor effectively switches the format, so a CSR input with a transpose flag set is equivalent to CSC when selecting a recommended configuration.
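The remark that a transpose flag effectively switches the format rests on a general property of compressed storage: the three arrays that describe a matrix in CSR describe its transpose when read as CSC. The following stand-alone sketch demonstrates this; compressed_to_dense is a hypothetical helper written for this illustration and is not a library routine.

```c
#include <assert.h>
#include <stddef.h>

/* Expand compressed arrays into a dense row-major nrows x ncols matrix.
   as_csc == 0: ptr indexes rows and idx holds column indices (CSR).
   as_csc != 0: ptr indexes columns and idx holds row indices (CSC). */
static void compressed_to_dense(int as_csc, int nrows, int ncols,
                                const int *ptr, const int *idx,
                                const float *val, float *dense)
{
    assert(nrows >= 0 && ncols >= 0);
    for (size_t t = 0; t < (size_t)nrows * (size_t)ncols; ++t) {
        dense[t] = 0.0f;
    }
    int nmajor = as_csc ? ncols : nrows;   /* length of the compressed axis */
    for (int m = 0; m < nmajor; ++m) {
        for (int p = ptr[m]; p < ptr[m + 1]; ++p) {
            int r = as_csc ? idx[p] : m;
            int c = as_csc ? m : idx[p];
            dense[(size_t)r * (size_t)ncols + (size_t)c] = val[p];
        }
    }
}
```

Reading ptr = {0, 2, 3}, idx = {0, 2, 1}, val = {1, 2, 3} as a 2 x 3 CSR matrix gives [[1, 0, 2], [0, 3, 0]]; reading the same arrays as a 3 x 2 CSC matrix gives its transpose [[1, 0], [0, 3], [2, 0]].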
Input Parameters
M
A graph matrix which contains the mask. If NULL, no mask will be used.
Currently, the mask values must be either boolean or of the same type as the values of the input matrices, unless MKL_GRAPH_MOD_ONLY_STRUCTURE is set in the descriptor for the mask.
accum
Binary operator to be used as an accumulator. Refer to Graph API Glossary for a list of possible options.
Currently only MKL_GRAPH_ACCUMULATOR_NONE is supported by this routine.
semiring
Algebraic semiring. Refer to Graph API Glossary for a list of possible options.
Currently, the following semirings are supported by this routine for all configurations:
  • MKL_GRAPH_SEMIRING_PLUS_TIMES_INT32
  • MKL_GRAPH_SEMIRING_PLUS_TIMES_INT64
  • MKL_GRAPH_SEMIRING_PLUS_TIMES_FP32
  • MKL_GRAPH_SEMIRING_PLUS_FIRST_FP32
  • MKL_GRAPH_SEMIRING_PLUS_SECOND_FP32
The semirings MKL_GRAPH_SEMIRING_PLUS_PAIR_INT32 and MKL_GRAPH_SEMIRING_PLUS_PAIR_INT64 are supported with a sparse mask M.
The semiring MKL_GRAPH_SEMIRING_ANY_PAIR_BOOL is supported with no mask or a dense mask.
A
A graph matrix which contains the input matrix A.
B
A graph matrix which contains the input matrix B.
The index types and value types of the matrices A and B should match. The exception is the value type of a matrix whose values are unused because MKL_GRAPH_MOD_ONLY_STRUCTURE is set for it in the descriptor.
desc
An operation descriptor. Refer to Graph API Glossary for a list of possible options. If NULL, no extra modifiers are used for the operation.
Currently it is allowed to set:
  • MKL_GRAPH_MOD_ONLY_STRUCTURE for any of the input matrices or the mask,
  • MKL_GRAPH_MOD_TRANSPOSE for any of the input matrices,
  • MKL_GRAPH_MOD_KEEP_MASK_STRUCTURE for the output.
request
An operation request as defined in the multistage execution model.
For single-stage execution, use MKL_GRAPH_REQUEST_COMPUTE_ALL. For multistage execution, use the stages MKL_GRAPH_REQUEST_FILL_NNZ and MKL_GRAPH_REQUEST_FILL_ENTRIES.
Refer to Graph API Glossary for a list of possible options.
method
A method which should be used for computing the result. For an automatic choice, use MKL_GRAPH_METHOD_AUTO. For a dot-product based method (supported only for the masked case with a sparse mask), use MKL_GRAPH_METHOD_DOT. For a Gustavson algorithm (supported only for the non-masked case and dense masks), use MKL_GRAPH_METHOD_GUSTAVSON. A “dense” mask means the mask has all possible entries present. It must still be stored in CSR or CSC format using mkl_graph_matrix_set_csr or mkl_graph_matrix_set_csc. Refer to Graph API Glossary for a list of possible options.
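The semiring names listed for the semiring parameter appear to follow the common GraphBLAS pattern of an additive operator followed by a multiplicative operator and a value type. The sketch below assumes that convention (FIRST returns its first operand, SECOND its second, PAIR always returns 1, ANY returns any one of its operands); the Graph API Glossary remains the authoritative definition. All names in the code are hypothetical, and float is used for every semiring only to keep the example short.

```c
#include <assert.h>

typedef float (*binop_t)(float, float);

/* Additive operators (assumed semantics). */
static float op_plus(float x, float y)   { return x + y; }
static float op_any(float x, float y)    { (void)y; return x; } /* any operand is a valid result */

/* Multiplicative operators (assumed semantics). */
static float op_times(float x, float y)  { return x * y; }
static float op_first(float x, float y)  { (void)y; return x; }
static float op_second(float x, float y) { (void)x; return y; }
static float op_pair(float x, float y)   { (void)x; (void)y; return 1.0f; }

/* Dot product of two sparse vectors (sorted index lists) over the semiring
   (add, mul). Only positions present in both vectors contribute.
   Returns 1 and stores the result in *out if at least one position
   overlaps; returns 0 and leaves *out untouched otherwise. */
static int semiring_dot(binop_t add, binop_t mul,
                        const int *a_idx, const float *a_val, int a_len,
                        const int *b_idx, const float *b_val, int b_len,
                        float *out)
{
    assert(a_len >= 0 && b_len >= 0);
    int p = 0, q = 0, hit = 0;
    float acc = 0.0f;
    while (p < a_len && q < b_len) {
        if (a_idx[p] < b_idx[q]) {
            ++p;
        } else if (a_idx[p] > b_idx[q]) {
            ++q;
        } else {
            float prod = mul(a_val[p], b_val[q]);
            acc = hit ? add(acc, prod) : prod;
            hit = 1;
            ++p;
            ++q;
        }
    }
    if (hit) *out = acc;
    return hit;
}
```

Under these assumptions, a plus-pair dot product counts the positions where both vectors have an entry, and an any-pair dot product only reports whether such a position exists.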
Output Parameters
C
A graph matrix which contains the output matrix C. If matrix C is non-empty on entry to the routine, its data is overwritten by the result of the computations.
Return Values
The function returns a value indicating whether the operation was successful or not and why. Refer to Graph API Glossary for a list of possible options.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE4.2, AVX2, AVX-512.
