Merging MPI Intercommunicators

Sometimes an MPI job consists of completely separate instances.  You can connect these instances using MPI_Comm_accept/MPI_Comm_connect, which create an intercommunicator.  In more complex situations, you may have a significant number of separate intercommunicators and want to send data between arbitrary ranks, perform collective operations across all ranks, or do other work that is most efficiently handled over a single intracommunicator.

To handle this situation, the MPI standard provides the function MPI_Intercomm_merge.  This function takes three arguments.  The first argument is the intercommunicator to be merged.  The second argument is a logical flag (an int in the C binding) indicating whether the calling side's ranks will be numbered in the high range (true) or the low range (false) of the resulting intracommunicator.  The third argument is a pointer to the new intracommunicator.  MPI_Intercomm_merge is a collective call: every rank on both sides of the intercommunicator must call it, and all ranks on a particular side must pass the same high/low value.  The two sides can pass the same or different values.  If they pass the same value, the resulting rank order is arbitrary.  If they pass different values, the ranks that passed low (false) get the lower rank numbers, and the ranks that passed high (true) get the higher rank numbers.  For example, if you have an intercommunicator with 2 ranks on side A and 3 ranks on side B, and you call MPI_Intercomm_merge with false on side A and true on side B, the side A ranks will have new ranks 0 and 1, and the side B ranks will have new ranks 2, 3, and 4.
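The numbering in the example above can be worked through with a few lines of plain C.  This is only a sketch of the arithmetic, not an MPI call: predicted_merged_rank is a hypothetical helper name, and it assumes the two sides pass different high values and that ranks keep their relative order within each side (common implementations behave this way, but treat it as an assumption).

```c
#include <stdbool.h>

/* Hypothetical helper, not part of MPI.  Predicts the rank a process would
 * receive from MPI_Intercomm_merge when the two sides pass different `high`
 * values.  Assumption: ranks keep their relative order within each side.
 *
 *   local_rank   rank of the caller within its own side
 *   remote_size  number of ranks on the other side
 *   high         value the caller's side passes as the `high` argument
 */
int predicted_merged_rank(int local_rank, int remote_size, bool high)
{
    /* The low side keeps its rank numbers; the high side is shifted up
     * by the number of ranks on the low (remote) side. */
    return high ? remote_size + local_rank : local_rank;
}
```

With 2 ranks on side A (low) and 3 ranks on side B (high), side A passes remote_size = 3 and side B passes remote_size = 2, which reproduces the ranks 0 to 1 and 2 to 4 described above.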

In a more complex situation, you may need to merge multiple intercommunicators.  This can be done in one of several ways, depending on how your ranks join.  If separate groups of ranks join independently, you can merge each group as it joins, and then use the resulting intracommunicator as the communicator on which the next group is accepted.

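/* Collective over localcomm: accept the next group of ranks. */
MPI_Comm_accept(port, MPI_INFO_NULL, 0, localcomm, &intercomm[0]);
/* Existing ranks take the low range; localcomm now includes the new ranks. */
MPI_Intercomm_merge(intercomm[0], false, &localcomm);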

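This updates localcomm to include all ranks as they join.  The joining ranks obtain the same intracommunicator by calling MPI_Comm_connect followed by MPI_Intercomm_merge on their side, and because MPI_Comm_accept is collective over localcomm, they must also take part in any later accept calls.  Alternatively, you can merge after all groups have joined.  This requires several additional steps of creating new intercommunicators and merging them, but leads to the same end result.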

Once this is done, you can use collectives across the new intracommunicator as if all ranks had originally been started under the same intracommunicator.
