MPI -rr (round robin) and -perhost settings with machinefile on Windows MPI 4

mcapogreco's picture

Hi,

I'm trying to find the best way to set up my mpiexec command so that I can make the following MPI calls:

1. Run mpiexec against a hostfile so that it runs only once on each host in the machinefile, for data sending purposes.

2. Once 1 is completed, run mpiexec against the same hostfile, using up to the maximum number of cores on each host, for calculation purposes.

I see that -rr and -perhost are not working with the Windows mpiexec.

Also, if possible, I would like to combine these two executables into one mpiexec call.

Cheers

Mark

James Tullos (Intel)'s picture

Hi Mark,

The -rr and -perhost options are available only in the Linux* version of the Intel MPI Library. On Windows*, other options do the same job.

Using -machinefile specifies a file with a list of hosts that will be used for the job. This automatically uses a round-robin approach. You can also specify the number of processes per host by appending ":" and the process count to the hostname. From the Reference Manual:

host1
host1
host2
host2
host3

is equivalent to:

host1:2
host2:2
host3
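To make the equivalence concrete, here is a minimal Python sketch that expands the compact host:n form into one hostname per process slot. The function name expand_machinefile is hypothetical, and the parsing is only an illustration of the two notations, not how mpiexec itself reads the file.

```python
def expand_machinefile(lines):
    """Expand 'host:n' entries into one hostname per process slot."""
    slots = []
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue  # ignore blank lines
        host, _, count = entry.partition(":")
        # An entry without ':n' counts as a single slot.
        slots.extend([host] * (int(count) if count else 1))
    return slots


compact = ["host1:2", "host2:2", "host3"]
verbose = ["host1", "host1", "host2", "host2", "host3"]

# Both notations describe the same sequence of slots.
print(expand_machinefile(compact) == expand_machinefile(verbose))
```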

You can also use -configfile to give each host in the job its own set of options, for example:

-host host1 -n 1 program.exe

-host host2 -n 1 program.exe
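If the host list is long, lines like these could be generated rather than typed. The sketch below is only an illustration: the helper name configfile_lines is hypothetical, and program.exe is the placeholder from the example above.

```python
def configfile_lines(hosts, program="program.exe"):
    """Build one '-host <name> -n 1 <program>' line per host."""
    return [f"-host {host} -n 1 {program}" for host in hosts]


# Write the result to a file and pass that file to mpiexec via -configfile.
for line in configfile_lines(["host1", "host2"]):
    print(line)
```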

As to how to combine your two jobs, that really depends on the jobs. What exactly do you mean by "data sending purposes" in the first step?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

mcapogreco's picture

Hi James,

Thanks for the feedback.

Basically, I need a round-robin (-perhost 1) approach across all hosts in the machinefile so that I transfer the data to each PC only once. Once that is finished, I run a second mpiexec so that each host runs the MAX number of processes, doing a calculation on the data sent by the previous mpiexec.

I would like both mpiexec runs to use the same machinefile, for consistency reasons.

I guess the main issue is running the first mpiexec only once per host when the machinefile is set up like

host1:MAX

host2:MAX

host3:MAX

Thanks for your help.

Cheers

Mark

James Tullos (Intel)'s picture

Hi Mark,

Would it be possible to have the data only reside on one computer, and have a single rank read the data and then transfer it directly to each of the processes? This would skip the first run entirely.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

mcapogreco's picture

Hi James,

This is how I set up my calculation initially, and it worked well except when the amount of data got large and we were using machines over a WAN. Ideally I want just one data push to each host, which reduces traffic and avoids a bottleneck on the network card.

I'm trying to find a smart way to do this using only the one hostfile, which would provide consistency and guarantee that every host used for calculation already has the data pushed into shared memory on that host.

Thanks for your help.

Cheers

Mark

James Tullos (Intel)'s picture

Hi Mark,

I tested this today. You can use the host:nprocs form and specify only one process for each host. If you request more processes than are available, the next process goes back to the first host on the list, similar to the -rr option on Linux*.
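One way to keep both runs consistent is to derive the one-process-per-host file from the full host:MAX file instead of maintaining two files by hand. The Python sketch below does that, and also models the wrap-around placement described above. Both function names are hypothetical, and place_ranks is only a simplified model of the behaviour I observed, not the launcher's actual algorithm.

```python
def one_process_per_host(machinefile_lines):
    """Rewrite 'host:MAX' entries as 'host:1', keeping the host order."""
    hosts = []
    for raw in machinefile_lines:
        host = raw.strip().partition(":")[0]
        if host and host not in hosts:
            hosts.append(host)
    return [f"{host}:1" for host in hosts]


def place_ranks(machinefile_lines, nprocs):
    """Model placement: fill slots in file order, wrapping to the first host."""
    slots = []
    for raw in machinefile_lines:
        entry = raw.strip()
        if not entry:
            continue
        host, _, count = entry.partition(":")
        slots.extend([host] * (int(count) if count else 1))
    return [slots[rank % len(slots)] for rank in range(nprocs)]


full = ["host1:4", "host2:4", "host3:4"]   # used for the calculation run
single = one_process_per_host(full)        # used for the data-push run
print(single)
print(place_ranks(single, 5))              # two more ranks than hosts
```

With three hosts and five requested processes, the model places ranks 3 and 4 back on host1 and host2, so the data-push run should request exactly as many processes as there are hosts.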

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
