I started a thread on this topic on Oct 25th, 2011, but I am unable to find that thread on this forum anymore; I searched the forum but couldn't locate it. I did receive two replies to my post by email, quoted below:
Reply 1: "Hi Nikhil, well, this is a really specific case - it would be nice if you could explain why you cannot use shared memory. Maybe we need to fix that issue instead of the performance degradation with ofa. The OFA fabric has its own settings and they were tuned."
Reply 2: "Have you tried running the mpitune utility on your scenario? Although shared memory is the best choice for your setup, the tool may help you to identify MPI parameters that need to be modified, as you are changing the usual assumptions regarding the environment."
My answers to these replies:
1. (Reply 1): The two processes launched on the node (call it the control node) get dispatched to other nodes (compute nodes) for actual execution; execution on the control node is only virtual. There are two scenarios here (see the sketch after this list):
(a) When both processes are dispatched to the same compute node, both "shm" and "ofa" work. But as I mentioned earlier, "ofa" runs quite slowly.
(b) When the two processes are dispatched to two different compute nodes (one process on each), only "ofa" works, and I get the same performance numbers as in (a) with "ofa".
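Just to illustrate the two scenarios, this is roughly how the two ranks end up placed (the machinefile approach and the hostfile names below are only placeholders for however the processes actually get dispatched in my environment):

    # scenario (a): both ranks on the same compute node ("shm" or "ofa" both work)
    export I_MPI_FABRICS=ofa
    mpiexec -machinefile hosts_one_node.txt -n 2 IMB-MPI1

    # scenario (b): one rank on each of two compute nodes (only "ofa" works)
    export I_MPI_FABRICS=ofa
    mpiexec -machinefile hosts_two_nodes.txt -n 2 IMB-MPI1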
2. (Reply 2): I tried mpitune, but it returns the errors below:
27'Oct'11 18:37:33 WRN | Invalid default value ('/home/vertex/config.xml') of argument ('config-file').
27'Oct'11 18:37:33 CER | Invalid default value ('/home/vertex/options.xml') of argument ('options-file').
27'Oct'11 18:37:33 CER | A critical error has occurred!
Type : exceptions.Exception
Value : Invalid default value ('/home/vertex/options.xml') of argument ('options-file').
Why does it ask for a default file?
To recap my setup: I am using Intel MPI version 4.0.2.003 on a CentOS 5.6 64-bit platform, running the IMB-MPI1 (Pallas) benchmark. I have set I_MPI_FABRICS=ofa; in other words, I need to force the use of OFED for communication between MPI processes.
When I run "mpiexec -n 2 IMB-MPI1", it launches two processes on the node.
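In other words, the whole run is essentially just:

    export I_MPI_FABRICS=ofa
    mpiexec -n 2 IMB-MPI1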
For a particular reason specific to my environment, I cannot use shared memory for I_MPI_FABRICS. The IMB-MPI1 benchmark suite runs fine, but the performance numbers I get are almost 40% lower than when I run the same suite with OpenMPI (also without shared memory). Of course, when I use I_MPI_FABRICS=shm and the processes execute on the same node, I get very high performance numbers.
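For comparison, the fast same-node case is simply:

    export I_MPI_FABRICS=shm
    mpiexec -n 2 IMB-MPI1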
My question is: is there a "loopback" mode in Intel MPI that I can try for processes running on the same node? Or is there any specific tuning parameter that I can use?