Two-stage Algorithm for Inspector-executor Sparse BLAS routines
You can use a two-stage algorithm for Inspector-executor Sparse BLAS routines which produce a sparse matrix. The applicable routines are:
In the two-stage algorithm:
- The first stage constructs the structure of the output matrix.
- The second stage constructs other arrays and performs the desired operation.
You can separate the calls for each stage, or perform the entire computation in a single call. The request parameter selects which of these is performed:
- SPARSE_STAGE_NNZ_COUNT: In the first stage, the algorithm computes only the row (CSR/BSR format) or column (CSC format) pointer array of the matrix storage format. The computed number of non-zeroes in the output matrix helps to calculate the amount of memory required.
- SPARSE_STAGE_FINALIZE_MULT: In the second stage, the algorithm computes the remaining column (CSR/BSR format) or row (CSC format) index and value arrays for the output matrix. Use this value only after calling the function with SPARSE_STAGE_NNZ_COUNT first.
- SPARSE_STAGE_FULL_MULT: Combine the two stages by performing the entire computation in a single step.
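To make the split between the stages concrete, the following is a minimal plain-C sketch of a two-stage CSR-by-CSR product. It is not the MKL implementation and does not call MKL: the csr_t type and the spgemm_count and spgemm_finalize functions are hypothetical names introduced only for illustration, and the sketch assumes zero-based, 3-array CSR storage. The first function fills only the row pointer array and returns the non-zero count; the second allocates the column index and value arrays from that count and fills them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical zero-based 3-array CSR container; not an MKL type. */
typedef struct {
    int rows, cols;
    int *row_ptr;    /* rows + 1 entries */
    int *col_idx;    /* row_ptr[rows] entries */
    double *val;     /* row_ptr[rows] entries */
} csr_t;

/* Stage 1: build only the row pointer array of C = A * B.
   Returns the structural non-zero count of C, or -1 on allocation failure. */
int spgemm_count(const csr_t *a, const csr_t *b, csr_t *c)
{
    assert(a->cols == b->rows);
    c->rows = a->rows;
    c->cols = b->cols;
    c->col_idx = NULL;   /* not computed in this stage */
    c->val = NULL;       /* not computed in this stage */
    c->row_ptr = malloc(((size_t)a->rows + 1) * sizeof *c->row_ptr);
    int *mark = malloc(((size_t)b->cols + 1) * sizeof *mark);
    if (!c->row_ptr || !mark) {
        free(mark); free(c->row_ptr); c->row_ptr = NULL;
        return -1;
    }
    for (int j = 0; j < b->cols; ++j) mark[j] = -1;

    c->row_ptr[0] = 0;
    for (int i = 0; i < a->rows; ++i) {
        int count = 0;
        for (int p = a->row_ptr[i]; p < a->row_ptr[i + 1]; ++p) {
            int k = a->col_idx[p];
            for (int q = b->row_ptr[k]; q < b->row_ptr[k + 1]; ++q) {
                int j = b->col_idx[q];
                /* mark[j] == i means column j was already counted for row i */
                if (mark[j] != i) { mark[j] = i; ++count; }
            }
        }
        c->row_ptr[i + 1] = c->row_ptr[i] + count;
    }
    free(mark);
    return c->row_ptr[a->rows];
}

/* Stage 2: allocate and fill col_idx and val using the row pointers
   from stage 1. Column indices within a row are left unsorted.
   Returns 0 on success, -1 on allocation failure. */
int spgemm_finalize(const csr_t *a, const csr_t *b, csr_t *c)
{
    int nnz = c->row_ptr[c->rows];
    c->col_idx = malloc(((size_t)nnz + 1) * sizeof *c->col_idx);
    c->val = malloc(((size_t)nnz + 1) * sizeof *c->val);
    int *pos = malloc(((size_t)b->cols + 1) * sizeof *pos);
    if (!c->col_idx || !c->val || !pos) {
        free(pos); free(c->col_idx); free(c->val);
        c->col_idx = NULL; c->val = NULL;
        return -1;
    }
    for (int j = 0; j < b->cols; ++j) pos[j] = -1;

    for (int i = 0; i < a->rows; ++i) {
        int start = c->row_ptr[i];
        int next = start;
        for (int p = a->row_ptr[i]; p < a->row_ptr[i + 1]; ++p) {
            int k = a->col_idx[p];
            double av = a->val[p];
            for (int q = b->row_ptr[k]; q < b->row_ptr[k + 1]; ++q) {
                int j = b->col_idx[q];
                if (pos[j] < start) {
                    /* first contribution to C(i, j): open a new slot */
                    pos[j] = next;
                    c->col_idx[next] = j;
                    c->val[next] = av * b->val[q];
                    ++next;
                } else {
                    /* slot already exists in this row: accumulate */
                    c->val[pos[j]] += av * b->val[q];
                }
            }
        }
    }
    free(pos);
    return 0;
}
```

In this sketch the caller learns the memory requirement from the return value of spgemm_count before any index or value storage exists, which mirrors the reason for separating the stages. The caller is responsible for freeing row_ptr, col_idx, and val.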
This example uses the two-stage algorithm for the mkl_sparse_sp2m routine with a matrix in CSR format:
First stage (SPARSE_STAGE_NNZ_COUNT):
- The algorithm calls the mkl_sparse_sp2m routine with the request parameter set to SPARSE_STAGE_NNZ_COUNT.
- The algorithm exports the computed rows_start and rows_end arrays using the mkl_sparse_x_export_csr routine.
- These arrays are used to calculate the number of non-zeroes (nnz) of the resulting output matrix.
Note that at this stage, the arrays related to column index and values for the output matrix have not been computed.
status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_NNZ_COUNT, &csrC);
/* optional calculation of nnz of resulting output matrix for computing memory requirement */
status = mkl_sparse_x_export_csr ( csrC, &indexing, &rows, &cols, &rows_start, &rows_end, &col_indx, &values);
MKL_INT nnz = rows_end[rows-1] - rows_start[0];
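The nnz calculation above subtracts two pointer values, so the index base (zero-based or one-based) cancels out. A small plain-C sketch of that arithmetic follows. The function names are hypothetical, plain int stands in for MKL_INT so the code compiles without MKL, and the first function assumes the rows are stored back to back, so that each row ends where the next one starts. The second function sums per-row lengths and does not rely on that assumption.

```c
#include <assert.h>

/* nnz as end-of-last-row minus start-of-first-row.
   Assumes contiguous rows: rows_end[i] == rows_start[i + 1]. */
int csr_nnz_contiguous(int rows, const int *rows_start, const int *rows_end)
{
    assert(rows > 0);
    return rows_end[rows - 1] - rows_start[0];
}

/* nnz as the sum of per-row lengths; no contiguity assumption. */
int csr_nnz_by_rows(int rows, const int *rows_start, const int *rows_end)
{
    int nnz = 0;
    for (int i = 0; i < rows; ++i)
        nnz += rows_end[i] - rows_start[i];
    return nnz;
}
```

Both functions return the same count for contiguous storage, regardless of whether the pointer arrays are zero-based or one-based.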
Second stage (SPARSE_STAGE_FINALIZE_MULT):
The algorithm computes the remaining storage arrays (related to column index and values for the output matrix) and performs the desired operation.
status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_FINALIZE_MULT, &csrC);
Alternatively, you can perform both operations in a single step:
Single stage operation (SPARSE_STAGE_FULL_MULT):
status = mkl_sparse_sp2m ( opA, descrA, csrA, opB, descrB, csrB, SPARSE_STAGE_FULL_MULT, &csrC);