Speedup of PARDISO

Hello,

I'm trying the PARDISO solver in MKL v9.0 on a shared-memory SGI Altix machine with Itanium 2 CPUs. My matrix is symmetric positive definite, with sizes up to 350000 x 350000, and I have tried setting OMP_NUM_THREADS to values up to 12. The machine runs a batch system, so no one else is using the CPUs I'm running on. However, as I increase the number of processors, performance degrades: the fastest execution is with one CPU. Other parts of the code scale well with OpenMP; only the PARDISO calls (all phases) do not.
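
To make "performance degrades" concrete, here is a minimal sketch of how speedup and parallel efficiency could be computed from wall-clock timings of a solver phase. The function names are hypothetical and the numbers in the usage note are placeholders, not measurements from the runs described here.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p); a value below 1.0 means a slowdown."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, threads):
    """Parallel efficiency E(p) = S(p) / p; 1.0 would be ideal scaling."""
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return speedup(t_serial, t_parallel) / threads
```

For example, with placeholder timings of 10.0 s on one thread and 12.5 s on four threads, speedup(10.0, 12.5) is below 1.0, which is the kind of slowdown described above.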

I use -O3 -openmp -mtune=itanium2 as compiler flags and link the following libraries: -lstdc++ -lmkl_solver -lmkl_lapack -lmkl_ipf -lmkl_lapack64 -lmkl -lvml -lguide -lpthread.

I have in mkl.cfg:
MKL_SERIAL = OMP
MKL_INPUT_CHECK = OFF

Is there anything fundamental I'm missing or not doing right? Could anyone point me in the right direction?

Thank you in advance,
Jozsef
