I am having trouble running my Fortran MPI program on a cluster using PBSPro_220.127.116.11723.
When I execute my job script (selected information shown here):
#PBS -l select=1:ncpus=8
mpirun -n 8 -env I_MPI_DEBUG 5 ./xhpl_intel64
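For completeness, the whole job script is essentially just this (the walltime value is only an example; the select line and the mpirun line are the ones shown above):

#!/bin/bash
#PBS -l select=1:ncpus=8
#PBS -l walltime=01:00:00
# walltime above is just an example value
cd $PBS_O_WORKDIR
mpirun -n 8 -env I_MPI_DEBUG 5 ./xhpl_intel64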
The scheduler allocates 8 cores for my program. However, if I ssh into the node and run top, I can see that four of the MPI processes get a core each while the remaining four share a single core, which gives very bad performance.
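A quick way to confirm the placement from the node is to check each rank's CPU affinity (a sketch; it assumes the ranks show up in pgrep under the name xhpl_intel64):

# print the CPU affinity list of every xhpl_intel64 process on this node
for pid in $(pgrep xhpl_intel64); do
    taskset -cp $pid
done

If the pinning is broken, several PIDs report the same single-core affinity list; the I_MPI_DEBUG 5 output at startup should also print where each rank is pinned.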
The weird thing is that this does not happen when using the Intel MPI Library 4.0.0.028 runtime, and it also does not happen when the program is executed outside the batch queue. With Intel MPI Library 4.0.1 and later, it does.
I notice that the pre-4.0.1 runtime does not complain about
Setting I_MPI_PROCESS_MANAGER=mpd and adding -machinefile $PBS_NODEFILE will, however, let the previously mentioned 8-core execution run at close to 100% CPU.
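So this variant of the launch works, with the export and the extra flag being the only changes:

export I_MPI_PROCESS_MANAGER=mpd
mpirun -machinefile $PBS_NODEFILE -n 8 -env I_MPI_DEBUG 5 ./xhpl_intel64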
If I run one 8-core job, then submit a 16-core job (so that its first 8 processes land on the same node as the first job and its next 8 cores are on another node), and then follow up with another 8-core job, the last job and the last 8 cores of the 16-core job are placed on the same cores. The sequence is shown below.
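To make that concrete, this is the submission order that triggers it (hypothetical script names, and assuming 16-core nodes with the 16-core job requested as two 8-cpu chunks):

qsub job8_a.pbs   # select=1:ncpus=8 -> first half of node 1
qsub job16.pbs    # select=2:ncpus=8 -> second half of node 1 + half of node 2
qsub job8_b.pbs   # select=1:ncpus=8 -> ends up on the same cores as job16's second half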