Intel® Cluster Studio


I have two Fortran MPI programs, driver.f90 and hello.f90 (both attached here).

driver.f90 calls MPI_COMM_SPAWN, which launches hello.x.

When I run it using the command "mpirun -np 2 ./driver.x", it crashes (output below this message). I noticed that spawned task 0 has a different parent communicator than the other tasks, and I suspect that is the cause of the segmentation fault. It is a very simple MPI program, and it works fine with both OpenMPI and MPICH. Does anybody know what the problem might be?

I'm using impi/ and ifort 15.0.3.
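
For readers without the attachments, the setup described above can be sketched roughly as follows. This is not the poster's driver.f90, only a minimal illustration of a collective MPI_COMM_SPAWN call. The child count (nspawn = 2) is an assumption, and ./hello.x is assumed to be a separate MPI program that calls MPI_INIT, optionally MPI_COMM_GET_PARENT, and MPI_FINALIZE.

```fortran
! Minimal sketch of a spawning driver (NOT the original driver.f90).
! Assumptions: ./hello.x exists and is an MPI program; nspawn is illustrative.
program driver
  use mpi
  implicit none
  integer :: ierr, rank, intercomm
  integer, parameter :: nspawn = 2   ! assumed total number of children

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! The call is collective over MPI_COMM_WORLD: all driver ranks take part,
  ! but only nspawn children are started in total (root 0 supplies the
  ! command and the count). intercomm connects the parents to the children.
  call MPI_COMM_SPAWN('./hello.x', MPI_ARGV_NULL, nspawn, MPI_INFO_NULL, &
                      0, MPI_COMM_WORLD, intercomm, MPI_ERRCODES_IGNORE, ierr)

  print *, 'driver rank', rank, 'returned from MPI_COMM_SPAWN'

  call MPI_FINALIZE(ierr)
end program driver
```

Built with an MPI Fortran compiler wrapper and started as in the post (mpirun -np 2 ./driver.x), each of the two driver ranks participates in a single spawn.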



Nested mpirun commands


   I have an MPI program that calls another MPI program (written by someone else) using a Fortran system call:



call MPI_INIT(..)

call system('mpirun -np 2 ./mpi_prog.x')

call MPI_FINALIZE(...)


When I run it (e.g. mpirun -np 4 ./driver.x), it crashes inside mpi_prog.x (at a barrier). When I build it using MPICH, though, it works fine. Any hints on what might be wrong? (I realize nested mpirun calls are completely outside the MPI standard and highly implementation-dependent.)


NOTE: when I do something like:

Error while building NAS benchmarks using Intel MPI

I am trying to build the NAS benchmarks using Intel MPI, and below is the makefile that I am using.









Problem with Intel MPI on >1023 processes

I have been testing code using Intel MPI (version 4.1.3 build 20140226) and the Intel compiler (version 15.0.1 build 20141023) with 1024 or more total processes. When we attempt to run on 1024 or more processes, we receive the following error:

MPI startup(): ofa fabric is not available and fallback fabric is not enabled 

Runs with fewer than 1024 processes do not produce this error, and I also do not receive this error with 1024 processes using OpenMPI and GCC.

Problems with Intel MPI

I have trouble running Intel MPI on a cluster whose nodes have different numbers of processors (12 and 32).

I use Intel MPI 4.0.3, and it works correctly on 20 nodes with 12 processors each (Intel(R) Xeon(R) CPU X5650 @ 2.67); all processors work correctly. When I then run Intel MPI on the other 3 nodes, with 32 processors each (Intel(R) Xeon(R) CPU E5-4620 v2 @ 2.00), they work correctly too.

Mapping ranks consecutively on nodes


   Running Intel MPI 4.1.3

   Contrary to the user guide, which states for the default round-robin mapping,

To change this default behavior, set the number of processes per host by using the -perhost option, and set the total number of processes by using the -n option. See Local Options for details. The first <# of processes> indicated by the -perhost option is executed on the first host; the next <# of processes> is executed on the next host, and so on.
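
As a worked illustration of the placement the quoted passage describes (not of what Intel MPI 4.1.3 actually does, which is the point in question), the expected host for each rank is simply the rank divided by the -perhost value. The sketch below is plain Fortran with no MPI calls; the values nprocs = 8 and perhost = 4 are assumptions chosen for illustration, corresponding to a hypothetical "mpirun -n 8 -perhost 4" run.

```fortran
! Sketch: rank-to-host placement as described in the quoted user-guide text.
! Assumptions: nprocs and perhost are illustrative values, not from the post.
program perhost_mapping
  implicit none
  integer, parameter :: nprocs  = 8   ! assumed value passed to -n
  integer, parameter :: perhost = 4   ! assumed value passed to -perhost
  integer :: rank

  do rank = 0, nprocs - 1
    ! Integer division: the first perhost ranks land on host 0,
    ! the next perhost ranks on host 1, and so on.
    print '(a,i0,a,i0)', 'rank ', rank, ' -> host index ', rank / perhost
  end do
end program perhost_mapping
```

Under these assumptions, ranks 0 to 3 map to the first host and ranks 4 to 7 to the second, which is the consecutive mapping the post title refers to.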
