mkl_graph_mxm
Computes a (masked) graph matrix-matrix product.
Syntax
mkl_graph_status_t mkl_graph_mxm (mkl_graph_matrix_t C, mkl_graph_matrix_t M, mkl_graph_accumulator_t accum, mkl_graph_semiring_t semiring, mkl_graph_matrix_t A, mkl_graph_matrix_t B, mkl_graph_descriptor_t desc, mkl_graph_request_t request, mkl_graph_method_t method);
Include Files
- mkl_graph.h
Description
The mkl_graph_mxm routine computes a (masked) graph matrix-matrix product defined as C<M> := accum(C, opA(A)*opB(B)), where A and B are the input matrices, C is the output matrix, accum is an optional binary operator used as an accumulator for the output matrix, and M is an optional mask for the output matrix. Modifications of the operation, including the matrix modifiers opA and opB, are defined through the operation descriptor desc. The routine supports both single-stage and multistage execution modes via the request parameter. See Graph Operations for a description of the two modes. You can select the method used for the computation via the method parameter. Use MKL_GRAPH_METHOD_AUTO for an automatic choice.
The operation can be done in place, meaning that the output matrix can be aliased with any of the input matrices or the mask. In that case, the original data is replaced.
In the masked case, by default, the pattern of the output matrix (the entries present) is the intersection of the pattern of the mask and the pattern of the non-masked product of matrices A and B. If MKL_GRAPH_MODIFIER_OUTPUT is set to MKL_GRAPH_MOD_KEEP_MASK_STRUCTURE in the descriptor desc, the pattern of the output matrix is exactly the pattern of the mask. The edges missing in the non-masked product are given the value zero in the output data type.
For user control of output storage allocation, use multistage execution.
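The two output patterns described above can be illustrated without calling the library. The following is a minimal plain-C sketch of the semantics only, not the oneMKL implementation: the function name masked_dot_csr_csc and its parameters are hypothetical, it assumes a PLUS_TIMES semiring over int32 values, 64-bit indices, a CSR mask, a CSR matrix A, a CSC matrix B, and sorted indices within each row and column. The keep_mask_structure flag stands in for the descriptor setting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference routine (NOT a oneMKL function).
 * Computes C<M> = A*B over PLUS_TIMES for a CSR mask M, CSR matrix A and
 * CSC matrix B. Indices inside each row/column are assumed to be sorted.
 * c_rs needs nrows + 1 slots; c_ci and c_v need nnz(M) slots.
 * Returns the number of entries stored in C. */
int64_t masked_dot_csr_csc(
    int64_t nrows,
    const int64_t *m_rs, const int64_t *m_ci,
    const int64_t *a_rs, const int64_t *a_ci, const int32_t *a_v,
    const int64_t *b_cs, const int64_t *b_ri, const int32_t *b_v,
    int keep_mask_structure,
    int64_t *c_rs, int64_t *c_ci, int32_t *c_v)
{
    assert(nrows >= 0);
    int64_t nnz = 0;
    for (int64_t i = 0; i < nrows; ++i) {
        c_rs[i] = nnz;
        /* Only positions (i, j) present in the mask are ever computed. */
        for (int64_t p = m_rs[i]; p < m_rs[i + 1]; ++p) {
            int64_t j = m_ci[p];
            int64_t ka = a_rs[i], kb = b_cs[j];
            int32_t sum = 0;
            int found = 0;
            /* Sorted-merge dot product of row i of A with column j of B. */
            while (ka < a_rs[i + 1] && kb < b_cs[j + 1]) {
                if (a_ci[ka] < b_ri[kb]) {
                    ++ka;
                } else if (a_ci[ka] > b_ri[kb]) {
                    ++kb;
                } else {
                    sum += a_v[ka] * b_v[kb];
                    found = 1;
                    ++ka;
                    ++kb;
                }
            }
            /* Default: keep (i, j) only if the product has an entry there.
             * Keep-mask mode: keep every mask position, storing zero when
             * the product has no entry. */
            if (found || keep_mask_structure) {
                c_ci[nnz] = j;
                c_v[nnz] = sum;
                ++nnz;
            }
        }
    }
    c_rs[nrows] = nnz;
    return nnz;
}
```

For example, with A = diag(1, 2), B = diag(3, 4), and a mask holding entries (0,0), (0,1), and (1,1), the default mode stores two entries (3 and 8), while the keep-mask mode stores three, with an explicit zero at (0,1).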
For maximum performance, the following configurations are recommended:
- For masked multiplication with a sparse mask, multiply CSR times CSC using MKL_GRAPH_METHOD_DOT or MKL_GRAPH_METHOD_AUTO. Use CSR format for the mask and for the output of multistage execution.
- For non-masked multiplication, or when the mask is dense, multiply matching formats (CSR times CSR or CSC times CSC). Use the same format for the mask and for multistage output, and use MKL_GRAPH_METHOD_GUSTAVSON or MKL_GRAPH_METHOD_AUTO.
A “dense” mask means the mask has all possible entries present. It must still be stored in CSR or CSC format using mkl_graph_matrix_set_csr or mkl_graph_matrix_set_csc. For the purpose of these recommendations, a transpose flag in the descriptor effectively switches the format, so a CSR input with a transpose flag set is equivalent to CSC when selecting a recommended configuration.
Input Parameters
- M
- A graph matrix which contains the mask. If NULL, no mask will be used. Currently the mask is only allowed to have either boolean values or values of the same type as the input matrices (unless MKL_GRAPH_MOD_ONLY_STRUCTURE is set in the descriptor for the mask).
- accum
- Binary operator to be used as an accumulator. Refer to Graph API Glossary for a list of possible options. Currently only MKL_GRAPH_ACCUMULATOR_NONE is supported by this routine.
- semiring
- Algebraic semiring. Refer to Graph API Glossary for a list of possible options. Currently, the following semirings are supported by this routine for all configurations: MKL_GRAPH_SEMIRING_PLUS_TIMES_INT32, MKL_GRAPH_SEMIRING_PLUS_TIMES_INT64, MKL_GRAPH_SEMIRING_PLUS_TIMES_FP32, MKL_GRAPH_SEMIRING_PLUS_FIRST_FP32, MKL_GRAPH_SEMIRING_PLUS_SECOND_FP32. The semirings MKL_GRAPH_SEMIRING_PLUS_PAIR_INT32 and MKL_GRAPH_SEMIRING_PLUS_PAIR_INT64 are supported with a sparse mask M. The semiring MKL_GRAPH_SEMIRING_ANY_PAIR_BOOL is supported with no mask or a dense mask.
- A
- A graph matrix which contains the input matrix A.
- B
- A graph matrix which contains the input matrix B. The types of indices and values of the matrices A and B should match (except for values that are unused because the flag MKL_GRAPH_MOD_ONLY_STRUCTURE is set in the descriptor for one of the matrices).
- desc
- An operation descriptor. Refer to Graph API Glossary for a list of possible options. If NULL, no extra modifiers are used for the operation. Currently it is allowed to set: MKL_GRAPH_MOD_ONLY_STRUCTURE for any of the input matrices or the mask, MKL_GRAPH_MOD_TRANSPOSE for any of the input matrices, MKL_GRAPH_MOD_KEEP_MASK_STRUCTURE for the output.
- request
- An operation request as defined in the multistage execution model. Refer to Graph API Glossary for a list of possible options. For single-stage execution, use MKL_GRAPH_REQUEST_COMPUTE_ALL. For multistage execution, use the stages MKL_GRAPH_REQUEST_FILL_NNZ and MKL_GRAPH_REQUEST_FILL_ENTRIES.
- method
- A method which should be used for computing the result. For an automatic choice, use MKL_GRAPH_METHOD_AUTO. For a dot-product based method (supported only for the masked case with a sparse mask), use MKL_GRAPH_METHOD_DOT. For a Gustavson algorithm (supported only for the non-masked case and dense masks), use MKL_GRAPH_METHOD_GUSTAVSON. A “dense” mask means the mask has all possible entries present. It must still be stored in CSR or CSC format using mkl_graph_matrix_set_csr or mkl_graph_matrix_set_csc. Refer to Graph API Glossary for a list of possible options.
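To make the semiring, request, and method parameters above more concrete, the following plain-C sketch shows a row-by-row (Gustavson-style) CSR times CSR product split into a counting pass and a filling pass. It is an illustration under stated assumptions, not the oneMKL implementation: the names mul_op_t, apply_mul, spgemm_count, and spgemm_fill are hypothetical, the additive operator is fixed to PLUS, values are float, and the mapping of the two passes onto the MKL_GRAPH_REQUEST_FILL_NNZ and MKL_GRAPH_REQUEST_FILL_ENTRIES stages is only a loose analogy. Column indices within each output row are emitted in order of first appearance and are not sorted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical multiply operators named after the TIMES, FIRST, SECOND and
 * PAIR semiring families; these are NOT oneMKL identifiers. */
typedef enum { MUL_TIMES, MUL_FIRST, MUL_SECOND, MUL_PAIR } mul_op_t;

float apply_mul(mul_op_t op, float a, float b)
{
    switch (op) {
    case MUL_TIMES:  return a * b;
    case MUL_FIRST:  return a;
    case MUL_SECOND: return b;
    default:         return 1.0f;   /* MUL_PAIR: 1 for every matching pair */
    }
}

/* Pass 1 (pattern only): fill c_rs[0..nrows] with the row offsets of
 * C = A*B. Returns the number of entries of C, or -1 if allocation fails. */
int64_t spgemm_count(int64_t nrows, int64_t ncols_b,
                     const int64_t *a_rs, const int64_t *a_ci,
                     const int64_t *b_rs, const int64_t *b_ci,
                     int64_t *c_rs)
{
    assert(nrows >= 0 && ncols_b > 0);
    /* last_row[j] == i means column j was already counted for row i. */
    int64_t *last_row = malloc((size_t)ncols_b * sizeof *last_row);
    if (!last_row) return -1;
    for (int64_t j = 0; j < ncols_b; ++j) last_row[j] = -1;
    int64_t nnz = 0;
    for (int64_t i = 0; i < nrows; ++i) {
        c_rs[i] = nnz;
        for (int64_t p = a_rs[i]; p < a_rs[i + 1]; ++p)
            for (int64_t q = b_rs[a_ci[p]]; q < b_rs[a_ci[p] + 1]; ++q)
                if (last_row[b_ci[q]] != i) { last_row[b_ci[q]] = i; ++nnz; }
    }
    c_rs[nrows] = nnz;
    free(last_row);
    return nnz;
}

/* Pass 2: fill c_ci and c_v (caller-allocated, c_rs[nrows] slots each)
 * using the offsets from pass 1. Returns 0, or -1 if allocation fails. */
int spgemm_fill(int64_t nrows, int64_t ncols_b,
                const int64_t *a_rs, const int64_t *a_ci, const float *a_v,
                const int64_t *b_rs, const int64_t *b_ci, const float *b_v,
                mul_op_t op, const int64_t *c_rs, int64_t *c_ci, float *c_v)
{
    assert(nrows >= 0 && ncols_b > 0);
    /* pos[j] >= c_rs[i] means column j already has a slot in row i. */
    int64_t *pos = malloc((size_t)ncols_b * sizeof *pos);
    if (!pos) return -1;
    for (int64_t j = 0; j < ncols_b; ++j) pos[j] = -1;
    for (int64_t i = 0; i < nrows; ++i) {
        int64_t next = c_rs[i];
        for (int64_t p = a_rs[i]; p < a_rs[i + 1]; ++p) {
            for (int64_t q = b_rs[a_ci[p]]; q < b_rs[a_ci[p] + 1]; ++q) {
                int64_t j = b_ci[q];
                float prod = apply_mul(op, a_v[p], b_v[q]);
                if (pos[j] < c_rs[i]) {          /* first hit in this row */
                    pos[j] = next;
                    c_ci[next] = j;
                    c_v[next] = prod;
                    ++next;
                } else {                          /* PLUS accumulation */
                    c_v[pos[j]] += prod;
                }
            }
        }
    }
    free(pos);
    return 0;
}
```

Splitting the work this way lets the caller allocate the index and value arrays after the first pass reports the entry count, which is the kind of control over output storage that multistage execution is meant to provide.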
Output Parameters
- C
- A graph matrix which contains the output matrix C. If matrix C is non-empty on entry to the routine, its data is overwritten by the result of the computations.
Return Values
The function returns a value indicating whether the operation was successful or not and why. Refer to Graph API Glossary for a list of possible options.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE4.2, AVX2, AVX-512.