MPI_BARRIER not finishing

Hi.
I have a problem with a "fairly simple" program in which MPI_BARRIER never finishes when run as 7 processes across 2 computers.
I have modified the test.f90 program supplied with Intel MPI 4.0.2.005 to reproduce the problem, and I have attached the file here.
The program is compiled through Visual Studio 2008 with Intel Composer XE 2011, version 12.0.4.196, Build 20110427.

The setup:
1 laptop, 2 cores, running 1 MPI process
1 computer, 12 cores, running 6 MPI processes

executing:
mpiexec -hosts 2 host1 1 host2 6 "MPI Test.exe"

Using the 2 hosts with their max number of cores or with only 1 core works.

From the Fortran source and the output, can you give any hints as to why the program does not finish?

Best Regards
Jesper Carlson

Attachments:
output.txt (12 KB)
test.f90 (2.92 KB)

Hi Jesper,

According to the Release Notes there is a requirement for a cluster to be homogeneous.
- The Intel MPI Library does not support heterogeneous clusters of mixed
architectures and/or operating environments.

Sometimes it's even difficult to run applications on a cluster with mixed OSes, such as Windows XP and Windows Server, or Windows 7 and Windows Server.
You are using completely different nodes, and this may lead to unpredictable results. Of course, I cannot reproduce the issue on our servers.

You may try to set the fast fabric explicitly (-env I_MPI_FABRICS shm:tcp). You may also try adding '-env I_MPI_PLATFORM 0', but I cannot guarantee that it will work.
BTW, to understand what's going wrong you can use the I_MPI_DEBUG environment variable with levels from 5 to 1000 (-env I_MPI_DEBUG 5).
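
As a rough sketch, these options might be combined with the command from the original post as shown below. The host names are the placeholders from that command (the second host written as host2), and placing the -env options before -hosts is an assumption rather than something confirmed in this thread. The first line forces the fabric and enables debug output; the second tries the platform setting instead.

```shell
mpiexec -env I_MPI_FABRICS shm:tcp -env I_MPI_DEBUG 5 -hosts 2 host1 1 host2 6 "MPI Test.exe"
mpiexec -env I_MPI_PLATFORM 0 -env I_MPI_DEBUG 5 -hosts 2 host1 1 host2 6 "MPI Test.exe"
```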

Regards!
Dmitry

Hi Dmitry,

Can you clarify this part:

According to the Release Notes there is a requirement for a cluster to be homogeneous.
- The Intel MPI Library does not support heterogeneous clusters of mixed
architectures and/or operating environments.

Does this mean all the machines have to have the same CPU model (e.g. i7) or brand (Intel/AMD), or just that they all have to be x64-based?

Hi Dmitry,

Ah yes, I see the error in not using a homogeneous setup; in other situations we do indeed stick to identical setups.

Using '-env I_MPI_PLATFORM 0' actually did allow the program to finish correctly.

I guess it's a flag for remembering the requirements :-)
Thank you for your help and time.

Jesper Carlson

Actually, with the latest versions of the Intel MPI Library you can use '-env I_MPI_PLATFORM auto'. In this case the library itself tries to identify the best settings for the existing nodes. Start-up may be a bit slower, but at least it's much better than a SegFault or something like that.
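
Applied to the command from the original post, that might look like the line below. This is only a sketch: host1 and host2 are the placeholder host names from that command, and putting -env before -hosts is an assumption.

```shell
mpiexec -env I_MPI_PLATFORM auto -hosts 2 host1 1 host2 6 "MPI Test.exe"
```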

Regards!
Dmitry
