Developer Reference for Intel® oneAPI Math Kernel Library for C

ID 766684
Date 11/07/2023
Public

cluster_sparse_solver

Calculates the solution of a set of sparse linear equations with single or multiple right-hand sides.

Syntax

void cluster_sparse_solver (_MKL_DSS_HANDLE_t pt, const MKL_INT *maxfct, const MKL_INT *mnum, const MKL_INT *mtype, const MKL_INT *phase, const MKL_INT *n, const void *a, const MKL_INT *ia, const MKL_INT *ja, MKL_INT *perm, const MKL_INT *nrhs, MKL_INT *iparm, const MKL_INT *msglvl, void *b, void *x, const int *comm, MKL_INT *error);

Include Files

  • mkl_cluster_sparse_solver.h

Description

The routine cluster_sparse_solver calculates the solution of a set of sparse linear equations

A*X = B

with single or multiple right-hand sides, using a parallel LU, LDL^T, or LL^T factorization, where A is an n-by-n matrix, and X and B are n-by-nrhs vectors or matrices.

NOTE:

This routine supports the Progress Routine feature. See Progress Function for details.

Input Parameters

NOTE:

Most of the input parameters (except for the pt, phase, and comm parameters and, for the distributed format, the a, ia, and ja arrays) must be set on the master MPI process only, and are ignored on other processes. Other MPI processes get all required data from the master MPI process using the MPI communicator, comm.

pt

Array of size 64.

Handle to internal data structure. The entries must be set to zero before the first call to cluster_sparse_solver.

CAUTION:

After the first call to cluster_sparse_solver, do not modify pt, as that could cause a serious memory leak.
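As a minimal sketch of the zeroing requirement above, the handle array can be cleared once before the first call. The helper name clear_handle and the macro PT_SIZE are hypothetical, and void * is used here as a stand-in for _MKL_DSS_HANDLE_t (an assumption made so the fragment compiles without the oneMKL headers).

```c
#include <stddef.h>

#define PT_SIZE 64  /* size of pt stated in the parameter description */

/* Hypothetical helper: clear every entry of the handle array before the
   first solver call. void * stands in for _MKL_DSS_HANDLE_t (assumption). */
void clear_handle(void *pt[PT_SIZE]) {
    for (size_t i = 0; i < PT_SIZE; ++i)
        pt[i] = NULL;
}
```

Call such a helper exactly once, before the analysis phase; clearing pt again between phases would discard the solver's internal state.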

maxfct

Ignored; assumed equal to 1.

mnum

Ignored; assumed equal to 1.

mtype

Defines the matrix type, which influences the pivoting method. Parallel Direct Sparse Solver for Clusters supports the following matrix types:

mtype   Matrix type
1       real and structurally symmetric
2       real and symmetric positive definite
-2      real and symmetric indefinite
3       complex and structurally symmetric
4       complex and Hermitian positive definite
-4      complex and Hermitian indefinite
6       complex and symmetric
11      real and nonsymmetric
13      complex and nonsymmetric
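The mtype values can be given readable names in calling code. The sketch below is illustrative only: the enumerator and function names are hypothetical and are not defined by oneMKL, and plain int stands in for MKL_INT.

```c
#include <stdbool.h>

/* Hypothetical names for the supported mtype values; not part of oneMKL. */
enum matrix_type {
    MTYPE_REAL_STRUCT_SYM    = 1,
    MTYPE_REAL_SPD           = 2,
    MTYPE_REAL_SYM_INDEF     = -2,
    MTYPE_COMPLEX_STRUCT_SYM = 3,
    MTYPE_COMPLEX_HPD        = 4,
    MTYPE_COMPLEX_HERM_INDEF = -4,
    MTYPE_COMPLEX_SYM        = 6,
    MTYPE_REAL_NONSYM        = 11,
    MTYPE_COMPLEX_NONSYM     = 13
};

/* Returns true if mtype is one of the nine supported values. */
bool mtype_is_supported(int mtype) {
    switch (mtype) {
    case 1: case 2: case -2: case 3: case 4:
    case -4: case 6: case 11: case 13:
        return true;
    default:
        return false;
    }
}

/* Returns true if the matrix entries are complex for this mtype. */
bool mtype_is_complex(int mtype) {
    return mtype == 3 || mtype == 4 || mtype == -4 ||
           mtype == 6 || mtype == 13;
}
```

A check like mtype_is_complex can help decide whether the a, b, and x arrays should be allocated as real or complex data.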

phase

Controls the execution of the solver. Usually it is a two- or three-digit integer. The first digit indicates the starting phase of execution and the second digit indicates the ending phase. Parallel Direct Sparse Solver for Clusters has the following phases of execution:

  • Phase 1: Fill-reduction analysis and symbolic factorization

  • Phase 2: Numerical factorization

  • Phase 3: Forward and Backward solve including optional iterative refinement

  • Memory release (phase = -1)

If a previous call to the routine has computed information from previous phases, execution may start at any phase. The phase parameter can have the following values:

phase   Solver Execution Steps
11      Analysis
12      Analysis, numerical factorization
13      Analysis, numerical factorization, solve, iterative refinement
22      Numerical factorization
23      Numerical factorization, solve, iterative refinement
33      Solve, iterative refinement
-1      Release all internal memory for all matrices
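The digit convention for phase can be sketched as a small decoder. This is an illustration under stated assumptions: decode_phase is a hypothetical helper, it accepts only the values listed in the table (11, 12, 13, 22, 23, 33, and -1), and it reports the memory-release value as first = last = -1.

```c
/* Hypothetical helper: split a phase value into its first and last
   execution step (1..3). Returns 0 on success, -1 if phase is not one of
   the values listed in the table. The memory-release value -1 is
   reported as first = last = -1. */
int decode_phase(int phase, int *first, int *last) {
    if (phase == -1) {
        *first = *last = -1;
        return 0;
    }
    int f = phase / 10;   /* first digit: starting phase */
    int l = phase % 10;   /* second digit: ending phase */
    if (phase < 11 || phase > 33 || f < 1 || f > 3 || l < f || l > 3)
        return -1;
    *first = f;
    *last = l;
    return 0;
}
```

For example, phase = 23 decodes to a run that starts at numerical factorization (2) and ends at the solve step (3).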

n

Number of equations in the sparse linear system of equations A*X = B. Constraint: n > 0.

a

Array. Contains the non-zero elements of the coefficient matrix A corresponding to the indices in ja. The coefficient matrix can be either real or complex. The matrix must be stored in the three-array variant of the compressed sparse row (CSR3) or in the three-array variant of the block compressed sparse row (BSR3) format, and the matrix must be stored with increasing values of ja for each row.

For CSR3 format, the size of a is the same as that of ja. Refer to the values array description in Three Array Variation of CSR Format for more details.

For BSR3 format, the size of a is the size of ja multiplied by the square of the block size. Refer to the values array description in Three Array Variation of BSR Format for more details.

NOTE:

For centralized input (iparm[39]=0), provide the a array for the master MPI process only. For distributed assembled input (iparm[39]=1 or iparm[39]=2), provide it for all MPI processes.

IMPORTANT:

The column indices of non-zero elements of each row of the matrix A must be stored in increasing order.

ia

For CSR3 format, ia[i] (i<n) points to the first column index of row i in the array ja. That is, ia[i] gives the index of the element in array a that contains the first non-zero element from row i of A. The last element ia[n] is taken to be equal to the number of non-zero elements in A, plus one. Refer to rowIndex array description in Three Array Variation of CSR Format for more details.

For BSR3 format, ia[i] (i<n) points to the first column index of row i in the array ja. That is, ia[i] gives the index of the element in array a that contains the first non-zero block from row i of A. The last element ia[n] is taken to be equal to the number of non-zero blocks in A, plus one. Refer to rowIndex array description in Three Array Variation of BSR Format for more details.

The array ia is accessed in all phases of the solution process.

Indexing of ia is one-based by default, but it can be changed to zero-based by setting the appropriate value to the parameter iparm[34]. For zero-based indexing, the last element ia[n] is assumed to be equal to the number of non-zero elements in matrix A.

NOTE:

For centralized input (iparm[39]=0), provide the ia array at the master MPI process only. For distributed assembled input (iparm[39]=1 or iparm[39]=2), provide it at all MPI processes.

ja

For CSR3 format, array ja contains column indices of the sparse matrix A. It is important that the indices are in increasing order per row. For symmetric matrices, the solver needs only the upper triangular part of the system, as shown for the columns array in Three Array Variation of CSR Format.

For BSR3 format, array ja contains column indices of the sparse matrix A. It is important that the indices are in increasing order per row. For symmetric matrices, the solver needs only the upper triangular part of the system, as shown for the columns array in Three Array Variation of BSR Format.

The array ja is accessed in all phases of the solution process.

Indexing of ja is one-based by default, but it can be changed to zero-based by setting the appropriate value to the parameter iparm[34].

NOTE:

For centralized input (iparm[39]=0), provide the ja array at the master MPI process only. For distributed assembled input (iparm[39]=1 or iparm[39]=2), provide it at all MPI processes.
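The CSR3 rules for ia and ja (row pointers that end at the number of non-zeros plus the index base, and strictly increasing column indices in each row) can be checked before calling the solver. The following is a minimal sketch, assuming centralized input and using plain int as a stand-in for MKL_INT; the function names csr3_check and csr3_nnz and the return codes are hypothetical.

```c
#include <stddef.h>

/* Hypothetical validator for CSR3 index arrays (centralized input).
   base is 1 for the default one-based indexing, 0 for zero-based.
   Returns 0 if the arrays are consistent, otherwise a negative code. */
int csr3_check(int n, const int *ia, const int *ja, int base) {
    if (n <= 0 || ia == NULL || ja == NULL) return -1;
    if (ia[0] != base) return -2;                /* first row starts at base */
    for (int i = 0; i < n; ++i) {
        if (ia[i + 1] < ia[i]) return -3;        /* row pointers must not decrease */
        for (int p = ia[i] - base; p < ia[i + 1] - base; ++p) {
            int col = ja[p] - base;
            if (col < 0 || col >= n) return -4;  /* column index out of range */
            if (p > ia[i] - base && ja[p] <= ja[p - 1])
                return -5;                       /* columns must increase per row */
        }
    }
    return 0;
}

/* Number of stored non-zeros implied by ia: ia[n] equals nnz + base. */
int csr3_nnz(int n, const int *ia, int base) {
    return ia[n] - base;
}
```

As a usage example, the upper triangle of the symmetric 3-by-3 matrix with rows (4 1 0), (1 5 2), (0 2 6) is stored one-based as a = {4, 1, 5, 2, 6}, ja = {1, 2, 2, 3, 3}, ia = {1, 3, 5, 6}, so ia[3] = 6 is the number of non-zeros plus one.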

perm

Ignored.

nrhs

Number of right-hand sides that need to be solved for.

iparm

Array, size 64. This array is used to pass various parameters to Parallel Direct Sparse Solver for Clusters Interface and to return some useful information after execution of the solver.

See cluster_sparse_solver iparm Parameter for more details about the iparm parameters.

msglvl

Message level information. If msglvl = 0, cluster_sparse_solver generates no output; if msglvl = 1, the solver prints statistical information to the screen.

Statistics include information such as the number of non-zero elements in L-factor and the timing for each phase.

Set msglvl = 1 if you report a problem with the solver, since the additional information provided can facilitate a solution.

b

Array, size n*nrhs. On entry, contains the right-hand side vector/matrix B, which is placed in memory contiguously. Element b[i+k*n] must hold the i-th component of the k-th right-hand side vector. Note that b is only accessed in the solution phase.
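The b[i+k*n] layout stores the right-hand sides one after another. A minimal sketch for real data follows; the helper names rhs_offset and pack_rhs are hypothetical, and complex matrix types would need a complex element type instead of double.

```c
#include <stddef.h>

/* Hypothetical helper: offset of component i of right-hand side k
   in a contiguous n-by-nrhs array, following b[i + k*n]. */
size_t rhs_offset(size_t i, size_t k, size_t n) {
    return i + k * n;
}

/* Copy nrhs separate vectors of length n into one contiguous array b. */
void pack_rhs(size_t n, size_t nrhs, const double *const *cols, double *b) {
    for (size_t k = 0; k < nrhs; ++k)
        for (size_t i = 0; i < n; ++i)
            b[rhs_offset(i, k, n)] = cols[k][i];
}
```

With n = 3 and nrhs = 2, the second right-hand side occupies b[3] through b[5].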

comm

MPI communicator. The solver uses the Fortran MPI communicator internally. Convert the MPI communicator to Fortran using the MPI_Comm_c2f() function. See the examples in the <install_dir>/examples directory.

Output Parameters

pt

Handle to internal data structure.

perm

Ignored.

iparm

On output, some iparm values report information such as the number of non-zero elements in the factors.

See cluster_sparse_solver iparm Parameter for more details about the iparm parameters.

b

On output, the array is replaced with the solution if iparm[5] = 1.

x

Array, size n*nrhs. If iparm[5] = 0, it contains the solution vector/matrix X, which is placed contiguously in memory. Element x[i+k*n] holds the i-th component of the k-th solution vector. Note that x is only accessed in the solution phase.
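Because the location of the solution depends on iparm[5], calling code may want to select the right array in one place. The sketch below assumes real data and uses plain int as a stand-in for MKL_INT; solution_array is a hypothetical helper name.

```c
/* Hypothetical helper: return the array that holds the solution after the
   solve phase. With iparm[5] = 0 the solution is in x; with iparm[5] = 1
   it overwrites b. int stands in for MKL_INT here (assumption). */
const double *solution_array(const int *iparm, const double *b,
                             const double *x) {
    return (iparm[5] == 1) ? b : x;
}
```

Note that x must still be passed to the solver when iparm[5] = 1; only the place where the result is read changes.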

error

The error indicator according to the table below:

error   Information
0       no error
-1      input inconsistent
-2      not enough memory
-3      reordering problem
-4      Zero pivot, numerical factorization or iterative refinement problem. If the error appears during the solution phase, try to change the pivoting perturbation (iparm[9]) and also increase the number of iterative refinement steps. If it does not help, consider changing the scaling, matching and pivoting options (iparm[10], iparm[12], iparm[20]).
-5      unclassified (internal) error
-6      reordering failed (matrix types 11 and 13 only)
-7      diagonal matrix is singular
-8      32-bit integer overflow problem
-9      not enough memory for OOC
-10     error opening OOC files
-11     read/write error with OOC files
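For logging, the error values above can be mapped to their descriptions. This is a minimal sketch: solver_error_text is a hypothetical helper, not a oneMKL function, and the strings are shortened copies of the table entries.

```c
/* Hypothetical helper: map the error value returned by the solver to the
   description given in the error table. Not part of oneMKL. */
const char *solver_error_text(int error) {
    switch (error) {
    case 0:   return "no error";
    case -1:  return "input inconsistent";
    case -2:  return "not enough memory";
    case -3:  return "reordering problem";
    case -4:  return "zero pivot, numerical factorization or "
                     "iterative refinement problem";
    case -5:  return "unclassified (internal) error";
    case -6:  return "reordering failed (matrix types 11 and 13 only)";
    case -7:  return "diagonal matrix is singular";
    case -8:  return "32-bit integer overflow problem";
    case -9:  return "not enough memory for OOC";
    case -10: return "error opening OOC files";
    case -11: return "read/write error with OOC files";
    default:  return "unknown error value";
    }
}
```

A caller would typically check error after each phase and stop, printing the mapped text, as soon as the value is non-zero.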