[itac + mpich] Fatal error in MPI_Comm_dup: Invalid communicator

Hi,

I use ITAC (v7.2.1.008) with mpich2. I source the itac environment with the mpich flag (i.e. source itac/7.2.1.008/bin/itacvars.sh mpich).

Everything is compiled with ifort and icc.

I set LD_PRELOAD to $VT_SLIB_DIR/libVT.so.

When I launch my app with mpirun I get:

Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff236f67a0) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff84faafa0) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff95c4e610) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fffd7d51a10) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff0054c160) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff400a8a80) failed
MPI_Comm_dup(103): Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(195): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff684344f0) failed
(.............)

Without LD_PRELOAD, there is no error.
I have also tested with Intel MPI v.3, and there is no error either.

The problem occurs only when I use mpich.

What is wrong?

Thanks

Matthieu

Hi Matthieu,

Thanks for posting and welcome to the HPC forums.

It would be great to know how you build your application with libVT. For example, the Intel Trace Collector also requires the following libraries to be linked in addition to libVT: -ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread. Is that the case for you? If you're not comfortable providing that information in a public forum, feel free to submit an issue via the Intel Premier Support webpage.

It would also be great to try out a small Hello World application with LD_PRELOAD, mpich, and the Intel Trace Collector, and see if that works. I can provide one to you, or, if you have the Intel MPI Library installed, one is available in the /test directory.

Regards,
~Gergana

Thanks for your answer.

I followed your suggestion and used test.f90 from the Intel MPI test directory.

1. I source the ifort, itac (with the mpich flag) and mpich environments.

2. The compilation command:

ifort test.f90 -I/users/matt/mpich2install/include/ -L/users/matt/mpich2install/lib -L$VT_SLIB_DIR -lVT -ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread -lfmpich -lmpich -lrt -Wl,--allow-shlib-undefined -o test

(I have to add -Wl,--allow-shlib-undefined, as the itac documentation says. Without this linker option, I get
slib_mpich/libVT.so: undefined reference to `PMPIO_Wait'...)

3. I run "test":

~/mpich2install/bin/mpirun -n 4 ./test

And I get the same kind of error:

Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(167): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff1830d4b0) failed
MPI_Comm_dup(95).: Invalid communicator
[cli_0]: aborting job:
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(167): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff1830d4b0) failed
MPI_Comm_dup(95).: Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(167): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff30a9dc30) failed
MPI_Comm_dup(95).: Invalid communicator
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(167): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff6026b3b0) failed
MPI_Comm_dup(95).: Invalid communicator
[cli_1]: aborting job:
Fatal error in MPI_Comm_dup: Invalid communicator, error stack:
MPI_Comm_dup(167): MPI_Comm_dup(comm=0x5b, new_comm=0x7fff6026b3b0) fail

Of course, without -lVT the program runs correctly.

I have also tested the LD_PRELOAD method and the static linking method, and I always get the same error.

My mpich version: mpich2 v.1.0.8
ifort v.11.1.038
itac : 7.2.1.008

Regards,

Matthieu
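A quick check that may help with a setup like the one above is to confirm which flavour of libVT the environment points at, and which trace and MPI libraries the binary resolves at run time. This is only a sketch: `vt_flavour` is a hypothetical helper introduced here, and the idea that the directory name (such as the `slib_mpich` seen in the linker message) reflects the MPI flavour is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the last component of the libVT directory,
# e.g. "slib_mpich" for /some/prefix/slib_mpich.
vt_flavour() {
    basename "$1"
}

# VT_SLIB_DIR is set by itacvars.sh; "unset" is printed if it is missing.
echo "libVT directory in use: $(vt_flavour "${VT_SLIB_DIR:-unset}")"

# If the binary from the thread exists here, list the trace and MPI
# shared libraries the dynamic loader would pick up for it.
if [ -f ./test ] && [ -x ./test ]; then
    ldd ./test | grep -i -e 'libvt' -e 'mpich'
fi
```

If the directory name does not match the MPI library the program is linked against, that mismatch is worth ruling out first.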

Quoting - matt.o
Hi,

I use ITAC (v7.2.1.008) with mpich2. I source the itac environment with the mpich flag (i.e. source itac/7.2.1.008/bin/itacvars.sh mpich).

Hi Matthieu,

Could you try using itacvars.sh mpich2 instead?
MPICH1 and MPICH2 have completely different libraries.

Best wishes,
Dmitry
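The suggestion above fits the error output: every failing call reports comm=0x5b, which is decimal 91. As far as I recall, MPICH1 defines MPI_COMM_WORLD as the small integer 91 in its mpi.h, while MPICH2 uses a large encoded handle. Treat both as assumptions to verify against your own headers. The sketch below decodes the handle, shows how the installed header defines MPI_COMM_WORLD, and retries with the mpich2 flavour. All paths are the ones quoted in the thread and are skipped if they do not exist.

```shell
#!/bin/bash
# The handle reported in every error line of the thread.
BAD_HANDLE=0x5b

# Shell arithmetic accepts hexadecimal constants, so this yields decimal.
handle_dec=$(($BAD_HANDLE))
echo "comm=$BAD_HANDLE is $handle_dec in decimal"

# Assumed header location, taken from the -I path in the compile line.
MPI_HEADER=/users/matt/mpich2install/include/mpi.h
if [ -f "$MPI_HEADER" ]; then
    # Show how this MPI installation defines MPI_COMM_WORLD.
    grep -n 'define MPI_COMM_WORLD' "$MPI_HEADER"
fi

# Retry with the mpich2 flavour, using the paths quoted in the thread.
ITAC_VARS=itac/7.2.1.008/bin/itacvars.sh
if [ -f "$ITAC_VARS" ]; then
    source "$ITAC_VARS" mpich2
    # Scope LD_PRELOAD to this one command instead of exporting it.
    LD_PRELOAD="$VT_SLIB_DIR/libVT.so" ~/mpich2install/bin/mpirun -n 4 ./test
fi
```

If the header's definition of MPI_COMM_WORLD is not 91, a libVT built for a different MPICH generation would be passing a handle that this MPI library cannot recognise.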
