PARDISO gives wrong answers when given too many rhs's

When I call PARDISO with too many rhs's I get wrong answers.

I first call PARDISO for phase 11 and phase 22. I keep the factored matrix in memory and then leave the subroutine that calls PARDISO.

I later call PARDISO to perform only phase 33. I send in a fully populated, "stacked" rhs vector (i.e. total number of entries is neq*nrhs).
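To make the "stacked" layout concrete, here is a minimal sketch in Python. It assumes that "stacked" means rhs 1 followed by rhs 2 and so on, each neq entries long; the helper names `stack_rhs` and `stacked_index` are made up for illustration and are not part of MKL.

```python
def stack_rhs(columns):
    """Concatenate equal-length rhs columns into one flat list."""
    neq = len(columns[0])
    if any(len(c) != neq for c in columns):
        raise ValueError("all right-hand sides must have neq entries")
    return [value for column in columns for value in column]


def stacked_index(i, j, neq):
    """0-based offset of equation i of right-hand side j in the stacked vector."""
    return j * neq + i


# Two tiny right-hand sides with neq = 3 (illustrative values only).
columns = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flat = stack_rhs(columns)
print(len(flat), flat[stacked_index(1, 1, 3)])
```

The total length is neq*nrhs, which is the quantity that grows quickly when both the model and the number of rhs's are large.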

Here's what I've found:

Case 1: When I call PARDISO for phase 33 only with all 9,850 rhs's, PARDISO returns completely wrong answers: many NaN's, and all other values are completely wrong.

Also, after the call to PARDISO, I cannot successfully deallocate some arrays (these arrays are totally unrelated to the matrices used by PARDISO). The message is "Invalid pointers". I suspect that PARDISO has overwritten some of the pointers and/or values in the arrays that I cannot deallocate. Perhaps a memory leak?

Case 2: When I call PARDISO for phase 33 only, sending in 3,000 rhs's at a time (i.e. I call PARDISO 4 times: 3,000 rhs's in each of the first three calls, and then the remaining 850), I get wrong answers, BUT the answers are almost reasonable. There are no NaN's, and without knowing better I might guess the answers are correct. In Case 2, the deallocation problem mentioned in Case 1 does NOT occur.

Case 3: When I call PARDISO for phase 33 only, sending in 100 rhs's at a time (i.e. I call PARDISO 99 times) I get the correct answers for everything, and, there are no problems.
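The chunking used in Cases 2 and 3 comes down to the loop bounds below. This is only a sketch in Python: the helper `rhs_chunks` is a made-up name, and the actual phase 33 call is left out, so it shows nothing more than which columns each call would cover.

```python
def rhs_chunks(nrhs, chunk):
    """Yield (first_column, column_count) for each phase 33 call, 0-based."""
    start = 0
    while start < nrhs:
        count = min(chunk, nrhs - start)
        yield start, count
        start += count


NRHS = 9850  # number of rhs's from the post

for chunk in (9850, 3000, 100):
    calls = list(rhs_chunks(NRHS, chunk))
    # Print chunk size, number of calls, and the (start, count) of the last call.
    print(chunk, len(calls), calls[-1])
```

With 3,000 rhs's per call this gives 4 calls, the last one covering 850 columns; with 100 per call it gives 99 calls, the last one covering 50.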

===============

What is causing this problem? I am concerned that other models (let's say I've got 500,000 equations) may give wrong answers without my knowing it, i.e. for a general model I have no way of knowing the largest number of rhs's that I can send into PARDISO in a single chunk.
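One thing that can be checked without PARDISO is how the size of the stacked rhs block compares with the largest 32-bit signed integer. The sketch below (Python) does only that arithmetic. It is an assumption, not something established in this thread, that any of these quantities is what PARDISO indexes with internally; the function names are made up for illustration.

```python
INT32_MAX = 2**31 - 1
NEQ = 183_180            # number of equations from the post
BYTES_PER_ENTRY = 16     # double precision complex: two 8-byte doubles


def rhs_block_sizes(neq, nrhs):
    """Size of the stacked rhs block in three candidate units."""
    entries = neq * nrhs
    return {
        "complex_entries": entries,
        "doubles": 2 * entries,
        "bytes": BYTES_PER_ENTRY * entries,
    }


def largest_nrhs_within_int32(neq, units_per_entry):
    """Largest nrhs for which neq * nrhs * units_per_entry fits in int32."""
    return INT32_MAX // (neq * units_per_entry)


for nrhs in (9850, 3000, 100):
    sizes = rhs_block_sizes(NEQ, nrhs)
    fits = {unit: size <= INT32_MAX for unit, size in sizes.items()}
    print(nrhs, sizes, fits)
```

For this model, 100 rhs's stay below the 32-bit limit in all three units, 3,000 exceed it only when counted in bytes, and 9,850 exceed it when counted in doubles or bytes. That pattern happens to line up with the three cases, but it is only a hint about where to look.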

==============

I know that I'm not running out of main memory, and I know that I'm not running out of swap space. Is this a problem with the number of threads?

=====================

MKL: Version 10.3 Update 4

Operating System: Red Hat Enterprise Linux AS release 4 (Nahant Update 8)

Environment Variables set:
MKL_NUM_THREADS = 32

Computer has 32 processors

Computer has 198 GB of memory

Computer has 68 GB of available swap space

==================================================

I am using PARDISO with mtype=6 (i.e. double precision complex, symmetric, in-core only)

Number of equations is 183,180

Needed number of rhs's = 9,850

==================================================

Any help would be GREATLY appreciated.

Thanks, Bob

Bob,

Would you please give us a standalone test case that reproduces this issue?

Gennady

Gennady,

I cannot create a separate test code. However, I believe that I know the problem. I believe that PARDISO needs to use 64-bit integer indexing for my matrix.

I'm currently using MKL version 10.3 update 4.  I am compiling with:  $MKLPATH/libmkl_intel_lp64.a  and $MKLPATH/libmkl_solver_lp64.a  and have included   $MKLINCLUDE/intel64/lp64, and am using -mcmodel=large

I have printed out the integer value iparm(18) that is returned from PARDISO immediately after phase 11.  It is a negative number for my matrix.  So, I'm guessing that I have an integer*4 versus integer*8 problem.  My question is how can I solve this problem with my current version of MKL.
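On output, iparm(18) reports the number of nonzeros in the factors, and a negative count is what a 32-bit signed integer shows once the true value passes 2^31 - 1. A minimal sketch in Python of that wraparound follows; the value 3,000,000,000 is hypothetical, since the actual number of nonzeros for this matrix is not given here, and `wrap_int32` is a made-up helper name.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(n):
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n


# Hypothetical nonzero count that does not fit in integer*4.
true_count = 3_000_000_000
stored = wrap_int32(true_count)
print(true_count > INT32_MAX, stored)
```

A count that fits in 32 bits comes back unchanged, so a negative value is a fairly direct sign that the quantity needs integer*8.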

I'm GUESSING that all I've got to do is change $MKLPATH/libmkl_intel_lp64.a to $MKLPATH/libmkl_intel_ilp64.a AND change the integer declarations in my PARDISO calling routine from integer*4 to integer*8. Do you agree?

Do you think that I also need to change the libmkl_solver_lp64.a to libmkl_solver_ilp64.a  and change the $MKLINCLUDE/intel64/lp64 to $MKLINCLUDE/intel64/ilp64  ??

Thanks,  Bob

 

Bob, 

Yes. You don't need to link explicitly with libmkl_solver_ilp64.a; it is enough to link with the libmkl_intel_lp64.a library only. If you cannot prepare and send us an example, I would recommend that you try the latest version, 11.1 Update 3. You can use the evaluation version for 30 days. We have fixed a number of similar issues since 10.3.

--Gennady

 
