I am helping someone run a large, parallel Fortran FEA code on our cluster. I compiled his code and Open MPI with Intel compilers (both versions 12.0.3 and 11.0.083). The code works on 4 processes, either (1 node, 4 cores) or (4 nodes, 1 core each), but it fails on 8 processes. The error we get is:
*** glibc detected *** /home1/david/DynaTest/src/dynaflow/dynaflow.v02_mpi_intel_64bits: double free or corruption (!prev): 0x0000000013fcf290 ***
*** glibc detected *** /home1/david/DynaTest/src/dynaflow/dynaflow.v02_mpi_intel_64bits: double free or corruption (!prev): 0x000000001d3a7bc0 ***
*** glibc detected *** /home1/david/DynaTest/src/dynaflow/dynaflow.v02_mpi_intel_64bits: munmap_chunk(): invalid pointer: 0x0000000013e75850 ***
I have attached the full error message in a text file.
I have tried different versions of Open MPI (1.3.2, 1.4.2, and 1.4.3) with the same results.
On the other hand, compiling with the PGI compilers (11.2-1) works fine on as many cores as I want. The compilation options for both compilers are: -g -O0.
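For completeness, this is roughly how I built Open MPI against the Intel compilers (install prefix and other site-specific options omitted):

    ./configure CC=icc CXX=icpc F77=ifort FC=ifort
    make all install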
I have run the code with a debugger and it fails on a deallocate:
    deallocate(jb, stat=ierr)   ! <-- this is the line that fails
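One thing I considered is guarding that call, along these lines (jb and ierr are the names from the failing line; the guard and the report are my additions, so treat this as a sketch):

    ! Guard against deallocating an unallocated array, and report
    ! the status code instead of silently continuing.
    if (allocated(jb)) then
       deallocate(jb, stat=ierr)
       if (ierr /= 0) write(*,*) 'deallocate(jb) failed, stat = ', ierr
    end if
    ! Note: if the heap is already corrupted, glibc aborts inside
    ! free() before stat is even set, so this cannot mask the real bug.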
The failing call is in a function, setup1, which comes from a library for parallel sparse matrices; I can post that code if it would help.
We are running CentOS 5.5, kernel 2.6.18-194.32.1.el5, and gcc 4.1.2 20080704. The InfiniBand network is from QLogic, with OpenFabrics Enterprise Distribution (OFED) 1.5.2. The nodes have 64-bit Intel quad-core CPUs (Xeon E5345 @ 2.33GHz).
Intel + Open MPI works on other, simpler codes.
I am running out of things to try, so I am looking for any clues on how to make this work. Maybe some special compile options? There might be a problem with the code itself, but it is hard for me to argue that since it works with PGI.
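In case it suggests anything, here is roughly what I plan to try next: rebuilding with Intel's runtime checks and running under valgrind (flag spellings are from the ifort and valgrind documentation; the executable name and source list are shortened here):

    # rebuild with runtime checking (bounds, pointers) and tracebacks
    mpif90 -g -O0 -traceback -check all -o dynaflow_mpi ...

    # look for the corruption where it happens, not where glibc
    # finally notices it
    mpirun -np 8 valgrind --leak-check=full --track-origins=yes ./dynaflow_mpi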