mpdboot problem

Hi all,

I'm using Intel MPI 3 (with the icc and ifort 10 compilers) on a two-node cluster with an Ethernet interconnect.

The mpdboot command:

# mpdboot --totalnum=2 --file=/root/mpd.hosts --mpd=/opt/MPI_LIBS/INTEL-MPI/bin64/mpd --verbose --ncpus=4 --ifhn=10a0101

gave the following error:

running mpdallexit on 10a0101
LAUNCHED mpd on 10a0101 via
RUNNING: mpd on 10a0101
LAUNCHED mpd on compute-0-0 via 10a0101
mpdboot_10a0101 (handle_mpd_output 589): from mpd on compute-0-0, invalid port info:
connect to address Connection refused
connect to address Connection refused
trying normal rsh (/usr/bin/rsh)

If the --rsh=/usr/bin/ssh option is used, mpdboot works fine, but an error still occurs when a job is submitted across the 2 nodes.
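
For reference, the ssh variant of the command can be sketched as below. This is only an illustration: the paths, node count, and options are copied from the command above, `--rsh=/usr/bin/ssh` is the option mentioned in this post, and `mpdboot_ssh_cmd` is a hypothetical helper name. It prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: values are copied from the post; adjust them for your cluster.
MPD_HOSTS=/root/mpd.hosts
MPD_BIN=/opt/MPI_LIBS/INTEL-MPI/bin64/mpd

# Hypothetical helper: print the mpdboot command for N nodes,
# using ssh instead of rsh as the remote shell.
mpdboot_ssh_cmd() {
    printf 'mpdboot --totalnum=%s --file=%s --mpd=%s --rsh=/usr/bin/ssh --verbose --ncpus=4 --ifhn=10a0101\n' \
        "$1" "$MPD_HOSTS" "$MPD_BIN"
}

# Show the command for the two-node case described above.
mpdboot_ssh_cmd 2
```

Once the ring is up, running `mpdtrace` on the head node should list both hosts, which is a quick way to confirm the boot before submitting a job.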

With MPICH2, both mpdboot and job submission work without any errors.

I don't understand why this fails with Intel MPI.

Can someone help me resolve this issue?

- Sanagmesh


Hi Sanagmesh,

It looks like a known bug. I believe it should not appear in the latest release (package ID: l_mpi_p_3.1.026).

Could you clarify the package ID of the Intel MPI Library you have? It can be found in the mpisupport.txt file. Would it be possible for you to upgrade if you have an older version?

Best regards, Andrey
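
A minimal sketch of the package ID lookup described above. The file name mpisupport.txt comes from the reply; its location under /opt/MPI_LIBS/INTEL-MPI is an assumption inferred from the --mpd path in the question, and `package_id` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the "Package ID" line(s) from a support file.
package_id() {
    grep -i 'Package ID' "$1"
}

# Assumed location of mpisupport.txt; adjust to your install root.
SUPPORT_FILE=/opt/MPI_LIBS/INTEL-MPI/mpisupport.txt

if [ -r "$SUPPORT_FILE" ]; then
    package_id "$SUPPORT_FILE"
else
    echo "mpisupport.txt not found at $SUPPORT_FILE; adjust the path" >&2
fi
```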

I'm using:
Package ID: l_mpi_p_3.0.043

Does this happen on every cluster when booting on more than one node?


Is it acceptable for you to upgrade to Intel MPI Library 3.1? If not, I would suggest requesting a patch for the "invalid port info" issue. As far as I know, one is available for the 3.0.043 package.

I upgraded Intel MPI to version 3.1. Now mpdboot runs without any errors.


