MPI startup fail

Hello,

I am running an MPI program in which 5 ranks communicate with each other. The program runs fine when I run all ranks on the Xeon host, but when I place one of the ranks on the MIC, the program crashes with a segmentation fault.

MPI version: Intel MPI 4.1.1.036

This is how I set up the libraries and environment for the MIC:

export LD_LIBRARY_PATH=/opt/intel/composerxe/lib/mic:$LD_LIBRARY_PATH

export I_MPI_MIC=1

scp /opt/intel/impi/4.1.1.036/mic/bin/pmi_proxy mic0:/bin
scp /opt/intel/impi/4.1.1.036/mic/lib/* mic0:/lib64/.

Then I use the following run command:

mpirun -genv I_MPI_DEBUG 5 -host localhost -n 1 FILE1 : -host mic0 -n 1 ~/FILE2 : -host localhost -n 1 FILE3 : -host localhost -n 1 FILE4 : -host localhost -n 1 FILE5

This is the error / MPI debug output I get:

[0] MPI startup(): cannot open dynamic library libdat2.so.2
[4] MPI startup(): cannot open dynamic library libdat2.so.2
[2] MPI startup(): cannot open dynamic library libdat2.so.2
[3] MPI startup(): cannot open dynamic library libdat2.so.2
[4] MPI startup(): cannot open dynamic library libdat2.so
[0] MPI startup(): cannot open dynamic library libdat2.so
[2] MPI startup(): cannot open dynamic library libdat2.so
[0] MPI startup(): cannot open dynamic library libdat.so.1
[4] MPI startup(): cannot open dynamic library libdat.so.1
[3] MPI startup(): cannot open dynamic library libdat2.so
[2] MPI startup(): cannot open dynamic library libdat.so.1
[4] MPI startup(): cannot open dynamic library libdat.so
[0] MPI startup(): cannot open dynamic library libdat.so
[2] MPI startup(): cannot open dynamic library libdat.so
[3] MPI startup(): cannot open dynamic library libdat.so.1
[4] ERROR - load_iblibrary(): Can't open IB verbs library: libibverbs.so: cannot open shared object file: No such file or directory

[2] ERROR - load_iblibrary(): Can't open IB verbs library: libibverbs.so: cannot open shared object file: No such file or directory
[0] ERROR - load_iblibrary(): Can't open IB verbs library: libibverbs.so: cannot open shared object file: No such file or directory

[3] MPI startup(): cannot open dynamic library libdat.so
[3] ERROR - load_iblibrary(): Can't open IB verbs library: libibverbs.so: cannot open shared object file: No such file or directory

[1] MPI startup(): cannot open dynamic library libdat2.so.2
[1] MPI startup(): cannot open dynamic library libdat2.so
[1] MPI startup(): cannot open dynamic library libdat.so.1
[1] MPI startup(): cannot open dynamic library libdat.so
[1] ERROR - load_iblibrary(): Can't open IB verbs library: libibverbs.so: cannot open shared object file: No such file or directory

[3] MPI startup(): shm and tcp data transfer modes
[4] MPI startup(): shm and tcp data transfer modes
[2] MPI startup(): shm and tcp data transfer modes
[0] MPI startup(): shm and tcp data transfer modes
[1] MPI startup(): shm and tcp data transfer modes

Hi, 

Have you transferred the listed library files (libdat.so, libdat2.so.2, ...) to the coprocessor? Also, I think you will need to set LD_LIBRARY_PATH on the coprocessor to point to the directory containing these files.
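
A minimal sketch of how that check could be done from a shell, assuming it is run in the environment the ranks actually see (for example, in a login shell on mic0). The helper name find_lib is hypothetical and not part of Intel MPI. It only looks at the directories passed to it, so a "not found" result does not cover the loader's default directories such as /lib64.

```shell
#!/bin/sh
# Hypothetical helper (not part of Intel MPI): print the first directory in a
# colon-separated search path that contains the named library file.
# Returns 0 if found, 1 otherwise.
find_lib() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    # The path is split on ':' once, when the loop word list is expanded.
    for dir in $search_path; do
        IFS=$old_ifs
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Report on two of the libraries named in the debug output above.
for lib in libdat2.so.2 libibverbs.so; do
    if dir=$(find_lib "$lib" "${LD_LIBRARY_PATH:-}"); then
        echo "$lib found in $dir"
    else
        echo "$lib not found on LD_LIBRARY_PATH"
    fi
done
```

Passing the search path as an argument, rather than reading LD_LIBRARY_PATH inside the function, also lets the same helper check a directory such as /lib64 directly.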

Hi Vikrant,

Just curious, have you tried copying your MPI program to mic0 and running it directly there instead of launching it from the host?

Hi Ioc-nguyen and Sumedh,

Thanks for the input.

The program runs fine now. The error was in the code itself: I changed the ranks of the processes in the mpirun command but did not update the corresponding ranks in the MPI_Send and MPI_Recv calls.
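
For anyone who hits the same mismatch, here is a rough sketch for listing which rank each executable ends up with. It assumes ranks are numbered in the order the colon-separated argument sets appear on the mpirun line. The function list_ranks is a hypothetical helper, not an Intel MPI tool. It understands only "-host H" and "-n N", so global options such as -genv must be left out, and the host placeholder "unknown" is just an illustrative default.

```shell
#!/bin/sh
# Hypothetical helper: print "rank host executable" for each rank, assuming
# ranks are numbered in the order the colon-separated argument sets appear.
# Only "-host H" and "-n N" are understood; other options are not handled.
list_ranks() {
    rank=0
    host=unknown
    count=1
    exe=
    expect=
    # A trailing ":" is appended so the last argument set is flushed too.
    for arg in "$@" ":"; do
        if [ -n "$expect" ]; then
            # The previous token was -host or -n; this token is its value.
            case $expect in
                host) host=$arg ;;
                count) count=$arg ;;
            esac
            expect=
            continue
        fi
        case $arg in
            -host) expect=host ;;
            -n) expect=count ;;
            :)
                # End of one argument set: emit one line per rank in it.
                i=0
                while [ "$i" -lt "$count" ]; do
                    printf '%s %s %s\n' "$rank" "$host" "$exe"
                    rank=$((rank + 1))
                    i=$((i + 1))
                done
                host=unknown
                count=1
                exe=
                ;;
            *)
                # The first bare word is taken as the executable; any later
                # words (its arguments) are ignored here.
                [ -n "$exe" ] || exe=$arg
                ;;
        esac
    done
}
```

Called as list_ranks -host localhost -n 1 FILE1 : -host mic0 -n 1 FILE2 : -host localhost -n 1 FILE3, it prints "0 localhost FILE1", "1 mic0 FILE2" and "2 localhost FILE3" on separate lines. Under the ordering assumption above, moving an executable to a different position on the command line changes its rank, so the destination and source arguments of the matching MPI_Send and MPI_Recv calls would have to change with it.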
