Intel® MPI Library supports asynchronous progress threads that allow you to manage communication in parallel with application computation and, as a result, achieve better communication/computation overlapping. This feature is supported for the release_mt and debug_mt versions only.
Asynchronous progress fully supports MPI point-to-point operations and blocking collectives, and partially supports non-blocking collectives (MPI_Ibcast, MPI_Ireduce, and MPI_Iallreduce).
To enable asynchronous progress, set the I_MPI_ASYNC_PROGRESS environment variable to 1. You can define the number of asynchronous progress threads per process by setting the I_MPI_ASYNC_PROGRESS_THREADS environment variable. The I_MPI_ASYNC_PROGRESS_ID_KEY variable sets the MPI info object key that is used to define the progress thread_id for a communicator.
Setting the I_MPI_ASYNC_PROGRESS_PIN environment variable to a list of logical processors allows you to control the pinning of the asynchronous progress threads. With N progress threads per process, the first N logical processors from the list are assigned to the threads of the first local process, the next N logical processors to the threads of the second local process, and so on.
For example, if the thread affinity list is 0,1,2,3 with 2 progress threads per process and 2 processes per node, then the progress threads of the first local process are pinned to logical processors 0 and 1, while the progress threads of the second local process are pinned to logical processors 2 and 3.
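The assignment rule above can be sketched as simple index arithmetic. The helper below is hypothetical and is not part of the Intel® MPI Library API; it only illustrates how a position in the pinning list could be derived from the local process index and the progress thread index. Returning -1 when the list is too short is an assumption of this sketch, not documented library behavior.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an Intel MPI function).
 * Returns the logical processor for progress thread `thread_idx`
 * of local process `local_rank`, given a pinning list and
 * `threads_per_proc` progress threads per process.
 * Returns -1 on invalid arguments or if the list is too short
 * (an assumption of this sketch only).
 */
int progress_thread_cpu(const int *pin_list, size_t pin_count,
                        int local_rank, int threads_per_proc,
                        int thread_idx)
{
    if (pin_list == NULL || local_rank < 0 || threads_per_proc <= 0 ||
        thread_idx < 0 || thread_idx >= threads_per_proc) {
        return -1;
    }

    /* Process k owns list positions [k*N, k*N + N - 1]. */
    size_t pos = (size_t)local_rank * (size_t)threads_per_proc
               + (size_t)thread_idx;
    if (pos >= pin_count) {
        return -1;
    }
    return pin_list[pos];
}
```

With the list 0,1,2,3 and 2 threads per process, calling the helper for local process 1 and thread 0 selects list position 2, which matches the example above.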
The code example is available below or in the async_progress_sample.c file in the doc/examples subdirectory of the package.
For more information on environment variables, refer to the Intel® MPI Library Developer Reference, section Environment Variable Reference > Environment Variable Reference for Asynchronous Progress Control.