MPI_BARRIER not finishing

Hi.
I have a problem with a "fairly simple" program in which MPI_BARRIER never finishes when running 7 MPI processes on 2 computers.
I have modified the test.f90 program supplied with MPI 4.0.2.005 to cause the problem, and I have added the file as an attachment here.
The program is compiled through Visual Studio 2008 with Intel Composer XE 2011, version 12.0.4.196, Build 20110427.

The setup:
1 laptop, 2 cores, running 1 MPI process
1 computer, 12 cores, running 6 MPI processes

executing:
mpiexec -hosts 2 host1 1 host2 6 "MPI Test.exe"

Running on the 2 hosts with their maximum number of cores, or with only 1 core, works.

From the Fortran source and the output, can you give any hints as to why the program does not finish?

Best Regards
Jesper Carlson

Attachments:
output.txt (text/plain, 12 KB)
test.f90 (application/octet-stream, 2.92 KB)

Hi Jesper,

According to the Release Notes there is a requirement for a cluster to be homogeneous.
- The Intel MPI Library does not support heterogeneous clusters of mixed
architectures and/or operating environments.

Sometimes it's even difficult to run applications on a cluster with mixed OSes, such as Windows XP and Windows Server, or Windows 7 and Windows Server.
You are using completely different nodes, and that may lead to unpredictable results. Of course, I cannot reproduce the issue on our servers.

You may try to set the fast fabric explicitly (-env I_MPI_FABRICS shm:tcp). You may also try adding '-env I_MPI_PLATFORM 0', but I cannot guarantee that it will work.
BTW, to understand what's going wrong you can use the I_MPI_DEBUG environment variable with levels from 5 to 1000 (-env I_MPI_DEBUG 5).

Regards!
Dmitry
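
The options suggested above can be tried one at a time. The following is a minimal Python sketch that only assembles and prints the corresponding command lines; it does not launch anything. The helper names (build_mpiexec_command, format_command) are hypothetical, host1 and host2 are the placeholder host names from the original command, and placing the -env options directly before the executable is an assumption that should be checked against the help output of your mpiexec.

```python
# Sketch: assemble mpiexec command lines for the options suggested in the thread.
# Nothing is executed here; the commands are only printed for inspection.

def build_mpiexec_command(hosts, executable, env=None):
    """Return the argument list for one mpiexec invocation.

    hosts: list of (hostname, process_count) pairs
    env:   optional dict of environment variables passed with -env
           (option placement before the executable is an assumption)
    """
    cmd = ["mpiexec", "-hosts", str(len(hosts))]
    for name, count in hosts:
        cmd += [name, str(count)]
    for var, value in (env or {}).items():
        cmd += ["-env", var, str(value)]
    cmd.append(executable)
    return cmd


def format_command(cmd):
    """Join arguments for display, quoting any that contain spaces."""
    return " ".join(f'"{arg}"' if " " in arg else arg for arg in cmd)


# Placeholder host names and process counts from the original post.
HOSTS = [("host1", 1), ("host2", 6)]
EXE = "MPI Test.exe"

# One variant per suggestion, so the effect of each can be seen in isolation.
VARIANTS = {
    "debug output": {"I_MPI_DEBUG": 5},
    "explicit fabric": {"I_MPI_FABRICS": "shm:tcp"},
    "platform setting": {"I_MPI_PLATFORM": 0},
}

if __name__ == "__main__":
    for label, env in VARIANTS.items():
        print(f"{label}: {format_command(build_mpiexec_command(HOSTS, EXE, env))}")
```

Trying one variable per run makes it easier to tell which setting changes the behaviour. The argument list returned by build_mpiexec_command could also be handed to subprocess.run once the option placement has been confirmed.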

Hi Dmitry,

Can you clarify this part:

According to the Release Notes there is a requirement for a cluster to be homogeneous.
- The Intel MPI Library does not support heterogeneous clusters of mixed
architectures and/or operating environments.

Does this mean all the machines have to have the same CPU model (e.g. i7) or brand (Intel/AMD), or just that they all have to be x64-based?

Hi Dmitry,

Ah yes, I see the error in not using a homogeneous setup. In other situations we do indeed stick to identical setups.

Using '-env I_MPI_PLATFORM 0' actually did allow the program to finish correctly.

I guess it's a flag for remembering the requirements :-)
Thank you for your help and time.

Jesper Carlson

Actually, with the latest versions of the Intel MPI Library you can use '-env I_MPI_PLATFORM auto'. In this case the library itself tries to identify the best settings for the existing nodes. Start-up may be a bit slower, but at least that's much better than a segfault or something like that.

Regards!
Dmitry
