I_MPI_PERHOST variable not working with IntelMPI v4.1.0.030

Setting the I_MPI_PERHOST environment variable does not produce the expected behavior with codes built with IntelMPI v4.1.0.030, while codes built with IntelMPI v4.1.0.024 behave as expected. See below for a description of the problem. The system OS is Red Hat Linux v6.3.


If 16 MPI processes are to be placed on each node and I_MPI_PERHOST is set to 8, then the first 8 processes should be placed on the first node, the second 8 processes on the second node, and so on, until every assigned node has 8 processes; the next 8 processes should then be placed on the first node again, and so on.
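The intended round-robin placement can be sketched in a few lines of Python (illustrative only; `perhost_mapping` is a hypothetical helper, and the node names are the ones from the job output below):

```python
# Sketch of the placement I_MPI_PERHOST=8 should produce across 4 nodes:
# ranks are assigned to nodes in blocks of `perhost`, cycling over the nodes.
def perhost_mapping(nranks, nodes, perhost):
    """Map each rank to a node: block of `perhost` ranks per node, round-robin."""
    return {r: nodes[(r // perhost) % len(nodes)] for r in range(nranks)}

nodes = ["n0006", "n0028", "n0008", "n0011"]
mapping = perhost_mapping(64, nodes, 8)
print(mapping[0], mapping[8], mapping[32])  # n0006 n0028 n0006
```

With 64 ranks, 4 nodes, and perhost=8, ranks 0-7 land on the first node, 8-15 on the second, and ranks 32-39 wrap back around to the first node.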

The job has the "select" line:

#PBS -l select=4:ncpus=16:mpiprocs=16

so that 16 MPI processes are to be placed on each of four nodes. The job uses the latest IMPI module "mpi/intelmpi/4.1.0.030" and also sets:

export I_MPI_PERHOST=8

But instead of placing 8 MPI processes on each node and then cycling back in round-robin mode, the first 16 processes are placed on the first node, the next 16 processes on the second node, etc., as if I had not set the I_MPI_PERHOST environment variable. A portion of the output is given below.

mpirun = .../compiler/intelmpi/4.1.0.030/bin64/mpirun

Nodes used:
n0006
n0028
n0008
n0011

Rank Processor Name
0 n0006
1 n0006
2 n0006
3 n0006
4 n0006
5 n0006
6 n0006
7 n0006
8 n0006
9 n0006
10 n0006
11 n0006
12 n0006
13 n0006
14 n0006
15 n0006
16 n0028
17 n0028
18 n0028
19 n0028
20 n0028
21 n0028
22 n0028
23 n0028
24 n0028
25 n0028
26 n0028
27 n0028
28 n0028
29 n0028
30 n0028
31 n0028
32 n0008
33 n0008
34 n0008
35 n0008
36 n0008
37 n0008
38 n0008
39 n0008
40 n0008
41 n0008
42 n0008
43 n0008
44 n0008
45 n0008
46 n0008
47 n0008
48 n0011
49 n0011
50 n0011
51 n0011
52 n0011
53 n0011
54 n0011
55 n0011
56 n0011
57 n0011
58 n0011
59 n0011
60 n0011
61 n0011
62 n0011
63 n0011
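The listing above matches a plain block distribution of 16 ranks per node, i.e. what happens when I_MPI_PERHOST is ignored. For comparison with the round-robin mapping described earlier, the observed behavior can be sketched as (hypothetical helper, same node list):

```python
# The observed output is a simple block distribution: the first `per_node`
# ranks fill the first node completely before any rank reaches the second node.
def block_mapping(nranks, nodes, per_node):
    """Map each rank to a node: fill each node with `per_node` ranks in order."""
    return {r: nodes[r // per_node] for r in range(nranks)}

nodes = ["n0006", "n0028", "n0008", "n0011"]
observed = block_mapping(64, nodes, 16)
print(observed[8])  # n0006 (observed), whereas I_MPI_PERHOST=8 should give n0028
```

Rank 8 is the first rank where the two mappings disagree, which makes it a quick check for whether the setting took effect.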


Hi George,

What happens if you use the latest version of the Intel® MPI Library, Version 4.1 Update 1?  If this is still showing the problem, please send the output with I_MPI_HYDRA_DEBUG=1.  This will generate a lot of output, so please capture it in a file and attach it to your reply.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Was the problem resolved? I'm also seeing a similar problem. We have IntelMPI v4.1.0.024 and v5.0.1.035. With the older version, the mpirun option -perhost works as expected, but it does not work with the newer version:

$ qsub -I -lnodes=2:ppn=16:compute,walltime=0:15:00
qsub: waiting for job 5731.hpc-class.its.iastate.edu to start
qsub: job 5731.hpc-class.its.iastate.edu ready

$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-40.its.iastate.edu

$ export I_MPI_ROOT=/shared/intel//impi/4.1.0.024
$ PATH="${I_MPI_ROOT}/intel64/bin:${PATH}"; export PATH
$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-39.its.iastate.edu

As James suggested, I issued the same commands (for IntelMPI v5.0.1.035) with I_MPI_HYDRA_DEBUG set to 1 (see the attached file). Interestingly, the first two lines of the output suggest that -perhost works (two different hostnames are printed), but at the end the same hostname is still printed twice.

Attachment: mpirun-perhost.txt (42.2 KB)

Marina,

In your case, it looks like the PBS* environment is overriding the -perhost option. Can you run outside of PBS*?
