Substitution time consuming in Pardiso

Substitution time consuming in Pardiso

Dear All,

I have a reactive transport model that utilizing pardiso as the solver. The model is discretized by control volume method and solved by newton iteration method. I do sybmolic factorization at first, then do numerical factorization and substitution in every time step. The model can work but I do have a question on the speedup problem.

The parameter for pardiso are as follows:
iparm = 0
iparm(1) = 1 ! no solver default
iparm(2) = 3 ! fill-in reordering from METIS ,0-MIN DEGREE, 2-METIS, 3-OPENMP VERSION
iparm(3) = 0 ! numbers of processors. Input the next call mkl_set_dynamic(0), mkl_set_num_threads(n);
iparm(4) = 61 ! 0-no iterative-direct algorithm; 10*L+K, K=1 CGS, K=2 CGS for symmetric, 1.0E-L: stopping criterion
iparm(5) = 0 ! no user fill-in reducing permutation
iparm(6) = 0 ! if == 0, the array of b is replaced with the solution x.
iparm(7) = 0 ! Output, Number of iterative refinement steps performed
iparm(8) = 9 ! numbers of iterative refinement steps, must be 0 if a solution is calculated with separate substitutions (phase = 331, 332, 333)
iparm(9) = 0 ! not in use
iparm(10) = 13 ! Default value 13, perturbe the pivot elements with 1E-13
iparm(11) = 1 ! use nonsymmetric permutation and scaling MPS
iparm(12) = 0 ! not in use
iparm(13) = 1 ! maximum weighted matching algorithm is switched-on (default for non-symmetric)
iparm(14) = 0 ! Output: number of perturbed pivots
iparm(15) = 0 ! Output, Peak memory on symbolic factorization.
iparm(16) = 0 ! Output, Permanent memory on symbolic factorization. This value is only computed in phase 1.
iparm(17) = 0 ! Output, Size of factors/Peak memory on numerical factorization and solution.
iparm(18) = 0 ! Input/output. Report the number of non-zero elements in the factors. >= 0 Disable reporting.
iparm(19) = 0 ! Input/output. Report number of floating point operations to factor matrix A. >= 0 Disable reporting.
iparm(20) = 0 ! Output: Numbers of CG Iterations. >0 CGS succeeded, reports the number of completed iterations.
iparm(24) = 1 ! Parallel factorization control, 0: classic algorithm, 1: two-level factorization algorithm, improve scalability on many threads.
iparm(25) = 0 ! Parallel forward/backward solve control. 0: Use parallel algorithm for the solve step; 1: Use the sequential forward/backward solve.
iparm(27) = 0 !check matrix error, 0-without check, 1-check

maxfct = 1
mnum = 1
nrhs = 1
error = 0 ! initialize error flag
msglvl = 0 ! print statistical information
mtype = 11 ! real unsymmetric

First, I do symbolic factorization with phase = 11 (One time)
phase = 11   ! only reordering and symbolic factorization
call pardiso (pt_in, maxfct, mnum, mtype, phase, n_in, a_in, ia_in, ja_in, idum, nrhs, iparm, msglvl, ddum, ddum, error)
Then, in every time step, I do newton iteration. For every newton iteration, I do numerical factorization and substitution:
!Newton iteration
phase = 22   ! only factorization
call pardiso (pt_in, maxfct, mnum, mtype, phase, n_in, a_in, ia_in, ja_in, idum, nrhs, iparm_in, msglvl, ddum, ddum, error)
phase = 33   ! only substitution
call pardiso (pt_in, maxfct, mnum, mtype, phase, n_in, a_in, ia_in, ja_in, idum, nrhs, iparm, msglvl, b_in, x_out, error)
!Newton iteration

The matrix is 2573775*2573775 with nozero entry of 12810555.
When I use 4 processors, the time consuming is as follows:
TimeStep  NewtonIteration  NumericalFactorization Substitution
1         1                2.673447               0.253894
1         2                0.000024               0.614713
1         3                0.000023               0.595046
2         1                0.000023               0.619408
2         2                0.000023               0.613553
...
My question is: Why does substitution cost so much time after the first step? In common sense, time consuming of substitution is much less than numerical factorization.

System is WIN7 X64
Compiler: Intel Parallel Studio XE 2013

Thanks.

7 posts / 0 new
For more complete information about compiler optimizations, see our Optimization Notice.

Hi,

I see that you set iparm(4) to nonzero balue. As result you Cholesky decomposition have been replaced to iterative algorithm with preconditioner calculated on first run. That is the reason of almost zero factorization time and significant solving time im your implementation.

With best regards,

Hi Alexander,

Thanks so much. I have another question on the problem of symbolic factorization. I tried to do symbolic factorization at first run and then in the following newton iterations, only do numerical factorization and substitution. For the simple problem, this works fine. But for the complex problem, things are quite different. The results are correct in the beginning, but after many iterations, the result is incorrect with large error. Then I tried to do symbolic factorization every step, it can generate correct results, but this significant reduce the efficient. Is there anything wrong?

Hi,

In nonsymmetric case you turn on matching and scaling to improve quality of factorization step. After whn you change matrix, you use reordering from initial matrix that have been reordering to improve factorization quality of that matrix. As results your reordering is not quite good and follows to bad results on factorization and solving. For nonsymmetric case better to call reorder step for each matrix.

With best regards,

Quote:

Hi,

In nonsymmetric case you turn on matching and scaling to improve quality of factorization step. After whn you change matrix, you use reordering from initial matrix that have been reordering to improve factorization quality of that matrix. As results your reordering is not quite good and follows to bad results on factorization and solving. For nonsymmetric case better to call reorder step for each matrix.

With best regards,

Hi Alexander,

Thanks so much.

Is there any alternative method to avoid symbolic factorization everytime or is there a criteria that can help me to judge when I need to call reorder step. I have tested on many simulations, for about half of the simulations, I can get good results with symbolic factorization only for the initial step, and for the others, I should do symbolic factorization every step to get correct results.

Thanks again and best regards,

Daniel

Hi Daniel,

It does depend on the problem. As i understood you solve nonlinear problem used PARDISO on each step to obtain solution of linear equations. The main problem of nonlinear problem - each one needs private investigation :) As for me, I call several factorization and solve step on one reorder step and made new reorder step in case when solution  of linear equation become poor - number of iterative of solving step increase or residual norm increase of something else. In case of convergent algorithm number of PARDISO call without new reordering step become bigger.

With best regards,

Quote:

Hi Daniel,

It does depend on the problem. As i understood you solve nonlinear problem used PARDISO on each step to obtain solution of linear equations. The main problem of nonlinear problem - each one needs private investigation :) As for me, I call several factorization and solve step on one reorder step and made new reorder step in case when solution  of linear equation become poor - number of iterative of solving step increase or residual norm increase of something else. In case of convergent algorithm number of PARDISO call without new reordering step become bigger.

With best regards,