Scalapack linear solver (memory problem)

Hi all,

I am trying to use ScaLAPACK to solve a distributed linear system. The C++ source code is attached. The code compiles, runs without problems, and gives the correct result. I then ran it under Valgrind (with the environment variable for MPI applications set properly) to check memory management. Under Valgrind the execution crashes:

valgrind MPI wrappers  6704: Active for pid 6704
valgrind MPI wrappers  6704: Try MPIWRAP_DEBUG=help for possible options
valgrind MPI wrappers  6703: Active for pid 6703
valgrind MPI wrappers  6705: Active for pid 6705
valgrind MPI wrappers  6703: Try MPIWRAP_DEBUG=help for possible options
valgrind MPI wrappers  6705: Try MPIWRAP_DEBUG=help for possible options
valgrind MPI wrappers  6712: Active for pid 6712
valgrind MPI wrappers  6712: Try MPIWRAP_DEBUG=help for possible options
[giorgio-VirtualBox:6712] *** An error occurred in MPI_Type_get_envelope
[giorgio-VirtualBox:6712] *** on communicator MPI_COMM_WORLD
[giorgio-VirtualBox:6712] *** MPI_ERR_INTERN: internal error
[giorgio-VirtualBox:6712] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 6712 on
node giorgio-VirtualBox exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[giorgio-VirtualBox:06702] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[giorgio-VirtualBox:06702] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

I started testing with Valgrind because I ran into memory problems in a more complex example, in which the system matrix and the right-hand side are built using several distributed linear algebra routines (pdgetri_, pdgemm_, pdgemv_) and the solutions of other auxiliary linear systems.

What is wrong? Thank you in advance for your help.

Massi

Compiler and linker: mpic++

Includes: /usr/lib/openmpi/include /opt/intel/composer_xe_2011_sp1.7.256/mkl/include

Link line:  -L$(MKLROOT)/lib/intel64 -lmkl_scalapack_lp64 -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lmkl_blacs_openmpi_lp64 -lpthread -lm 

Compiler options: -DMKL_LP64

Environment variables: LD_LIBRARY_PATH=/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64

 


Attachment: scalapack.cpp (2.2 KB)

Hi, the Valgrind output seems to indicate that the error occurred inside an MPI routine. Which vendor's MPI do you use? Is it Intel MPI?

Hi Zhang,

Thank you for your reply. I don't use Intel MPI; I use OpenMPI 1.4.3. I solved the memory problems in the more complex code that I mentioned in the first post (it was my fault: I passed wrong input to pdgetri_), but I still can't run the code under Valgrind without it crashing.

Best regards,

Massi 
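
The post above does not say which pdgetri_ argument was wrong. One frequent source of memory errors with distributed routines is a local array sized inconsistently with the block-cyclic layout, so here is a minimal sketch, assuming that kind of mistake, of how the local extents could be checked without MPI. The names local_extent, local_lld and local_elements are hypothetical helpers, not part of ScaLAPACK or of the attached scalapack.cpp; local_extent re-implements the arithmetic of the ScaLAPACK tool routine NUMROC in plain C++, and the source process is assumed to be (0,0).

```cpp
#include <algorithm>
#include <cassert>

// Rows (or columns) of a global dimension n, cut into blocks of size nb, that
// land on process iproc out of nprocs in a block-cyclic layout whose first
// block lives on process isrcproc. Same arithmetic as ScaLAPACK's NUMROC.
int local_extent(int n, int nb, int iproc, int isrcproc, int nprocs) {
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs);
    assert(isrcproc >= 0 && isrcproc < nprocs);

    const int mydist  = (nprocs + iproc - isrcproc) % nprocs; // distance from source process
    const int nblocks = n / nb;                               // number of full blocks
    int extent        = (nblocks / nprocs) * nb;              // full rounds over all processes
    const int extra   = nblocks % nprocs;                     // leftover full blocks

    if (mydist < extra) {
        extent += nb;          // this process gets one more full block
    } else if (mydist == extra) {
        extent += n % nb;      // this process gets the trailing partial block
    }
    return extent;
}

// Smallest legal local leading dimension for a matrix with m global rows.
int local_lld(int m, int mb, int myrow, int nprow) {
    return std::max(1, local_extent(m, mb, myrow, 0, nprow));
}

// Minimum number of elements in the local array of an m x n matrix on
// process (myrow, mycol) of an nprow x npcol grid, using the smallest lld.
long local_elements(int m, int n, int mb, int nb,
                    int myrow, int mycol, int nprow, int npcol) {
    const long lld  = local_lld(m, mb, myrow, nprow);
    const long cols = local_extent(n, nb, mycol, 0, npcol);
    return lld * cols;
}
```

For a hypothetical 9 x 9 matrix with 2 x 2 blocks on a 2 x 2 process grid, local_elements(9, 9, 2, 2, 0, 0, 2, 2) gives 25 for process (0,0) and local_elements(9, 9, 2, 2, 1, 1, 2, 2) gives 16 for process (1,1). Comparing such values against the sizes actually allocated, and against the leading dimension stored in the array descriptor, is one way to catch a mismatch before calling a routine such as pdgetri_. The workspace sizes of pdgetri_ are a separate matter and are not covered by this sketch.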

Massi,

Can you try linking with Intel MPI (you can download a 30-day trial version from http://software.intel.com/en-us/intel-mpi-library)? Also, try running the program with only 1 MPI rank. Does it run OK under Valgrind? I suspect this is a problem in OpenMPI and has nothing to do with MKL. By the way, you are still using Intel Composer XE 2011, which was released more than 2 years ago. You may want to consider updating to Intel Composer XE 2013 or, at least, updating MKL to the latest 11.1 version. Many improvements and bug fixes have been made over the last 2 years.

 

 
