Developer Reference

  • 2020
  • 09/11/2020
  • Public Content

Autotuning

Tuning depends heavily on the specifications of the particular platform. Intel carefully determines the tuning parameters for a limited set of platforms and makes them available for autotuning through the I_MPI_TUNING_MODE environment variable.
For the full list of platforms supported by the I_MPI_TUNING_MODE environment variable, see Tuning Environment Variables. This variable has no effect on platforms not included in this list. For these platforms, use the I_MPI_TUNING_AUTO Family Environment Variables directly to find the best settings.
The autotuner lets you automatically find the best algorithms for collective operations. You can modify the autotuner search space with the I_MPI_ADJUST_<opname>_LIST variables from I_MPI_ADJUST Family Environment Variables.
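As a minimal sketch of narrowing the search space, the fragment below restricts the MPI_Allreduce candidates before an autotuning run. The list syntax (comma-separated IDs and ranges) and the algorithm IDs shown are assumptions for illustration only; check the I_MPI_ADJUST Family Environment Variables reference for the IDs that are valid for each collective.

```shell
# Assumed list syntax: comma-separated algorithm IDs and ID ranges.
# The IDs below are illustrative, not a recommendation.
export I_MPI_ADJUST_ALLREDUCE_LIST=1-4,7

# Enable the autotuner and store its results, as described on this page.
export I_MPI_TUNING_MODE=auto
export I_MPI_TUNING_BIN_DUMP=./tuning_results.dat
```

With these variables set, the autotuner would be expected to consider only the listed MPI_Allreduce algorithms during the tuning run.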
The collectives currently available for autotuning are:
MPI_Allreduce, MPI_Bcast, MPI_Barrier, MPI_Reduce, MPI_Gather, MPI_Scatter, MPI_Alltoall, MPI_Allgatherv, MPI_Reduce_scatter, MPI_Reduce_scatter_block, MPI_Scan, MPI_Exscan, MPI_Iallreduce, MPI_Ibcast, MPI_Ibarrier, MPI_Ireduce, MPI_Igather, MPI_Iscatter, MPI_Ialltoall, MPI_Iallgatherv, MPI_Ireduce_scatter, MPI_Ireduce_scatter_block, MPI_Iscan, and MPI_Iexscan.
To get started with autotuning, follow these steps:
  1. Launch the application with the autotuner enabled and specify the dump file, which stores results:
    I_MPI_TUNING_MODE=auto I_MPI_TUNING_BIN_DUMP=<tuning_results.dat>
  2. Launch the application with the tuning results generated at the previous step:
    I_MPI_TUNING_BIN=<tuning_results.dat>
    Or use the -tune Hydra option.
  3. If you experience performance issues, see Environment Variables for Autotuning.
For example:
  1. $ export I_MPI_TUNING_MODE=auto
     $ export I_MPI_TUNING_AUTO_SYNC=1
     $ export I_MPI_TUNING_AUTO_ITER_NUM=5
     $ export I_MPI_TUNING_BIN_DUMP=./tuning_results.dat
     $ mpirun -n 128 -ppn 64 IMB-MPI1 allreduce -iter 1000,800 -time 4800
  2. $ export I_MPI_TUNING_BIN=./tuning_results.dat
     $ mpirun -n 128 -ppn 64 IMB-MPI1 allreduce -iter 1000,800 -time 4800
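The two-step workflow can also be wrapped in small shell helpers so that the tuning variables apply only to the intended run and do not leak into the rest of the session. This is a sketch, not part of the product: the function names tuning_pass and tuned_pass and the TUNING_FILE variable are hypothetical, and the check in tuned_pass assumes the dump is written as a regular file at the given path.

```shell
# Path of the tuning results file (hypothetical helper variable).
TUNING_FILE="${TUNING_FILE:-./tuning_results.dat}"

# Step 1: run the given command with the autotuner enabled,
# dumping the results to $TUNING_FILE.
tuning_pass() {
    env I_MPI_TUNING_MODE=auto \
        I_MPI_TUNING_BIN_DUMP="$TUNING_FILE" \
        "$@"
}

# Step 2: run the given command with the previously generated results.
# Refuses to start if the results file is missing or empty.
tuned_pass() {
    if [ ! -s "$TUNING_FILE" ]; then
        echo "tuning file '$TUNING_FILE' is missing or empty" >&2
        return 1
    fi
    env I_MPI_TUNING_BIN="$TUNING_FILE" "$@"
}
```

A possible usage is `tuning_pass mpirun -n 128 -ppn 64 IMB-MPI1 allreduce` followed by `tuned_pass mpirun -n 128 -ppn 64 IMB-MPI1 allreduce`. Passing the variables through env keeps them scoped to a single launch.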
To tune collectives on a communicator identified with the help of Application Performance Snapshot (APS), set the following variable at step 1:
    I_MPI_TUNING_AUTO_COMM_LIST=comm_id_1, ... , comm_id_n
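As a sketch of how the communicator list might be assembled, the helper below joins identifiers into a single comma-separated value. The join_comm_ids function is hypothetical, comm_id_1 and comm_id_2 are placeholders for the identifiers reported by APS, and the use of commas without surrounding spaces is an assumption about the accepted format.

```shell
# Join all arguments with commas (hypothetical helper).
join_comm_ids() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local.
    ( IFS=,; printf '%s\n' "$*" )
}

# comm_id_1 and comm_id_2 are placeholders, not real communicator IDs.
I_MPI_TUNING_AUTO_COMM_LIST="$(join_comm_ids comm_id_1 comm_id_2)"
export I_MPI_TUNING_AUTO_COMM_LIST
```

The variable would then be set alongside the other step 1 variables before launching the tuning run.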

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804