We added new nodes to our cluster. On the 12 old nodes, Intel MPI (Intel Cluster Studio 2010, 2011, 2012) works without any problems. The 4 new nodes run exactly the same OS as the 12 old ones, with the same installation (node image). The hardware is the same too.
We set I_MPI_FABRICS=shm:ofa
If I start mpirun on the 12 old nodes, it works without problems.
If I try to start a parallel job that includes one of the new nodes, I get:
send desc error  Abort: Got FATAL event 3 at line 861 in file ../../ofa_utility.c
If I start a local job on one of the new nodes, it works.
So it seems to be related to InfiniBand.
That is strange, because an Open MPI run over InfiniBand works on the new nodes.
If I use I_MPI_FABRICS=shm:dapl with the new nodes, it works.
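
For anyone trying to reproduce this, the comparison between the two fabric settings can be sketched roughly as below. This is only an illustration: hosts.txt (a host file listing one old node and one new node) and ./hello (any small MPI test binary) are hypothetical names, and I_MPI_DEBUG=5 is added only so that Intel MPI prints more startup detail. The script just prints the two command lines; pipe its output to sh to actually run them.

```shell
#!/bin/sh
# Sketch only: hosts.txt and ./hello are placeholder names, not from the original setup.

fabric_cmd() {
    # $1 = value for I_MPI_FABRICS, e.g. shm:ofa or shm:dapl
    echo "mpirun -f hosts.txt -ppn 1 -genv I_MPI_FABRICS $1 -genv I_MPI_DEBUG 5 -n 2 ./hello"
}

# Print one command per fabric so the failing (ofa) and working (dapl)
# cases can be run back to back on the same pair of nodes.
for fabric in shm:ofa shm:dapl; do
    fabric_cmd "$fabric"
done
```

With one process per node (-ppn 1) the two ranks have to talk over the interconnect, so the shm:ofa line should hit the error while the shm:dapl line should not, if the behaviour described above holds.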