Developer Reference

  • 2020 Update 2
  • 07/15/2020

I_MPI_ADJUST Family Environment Variables

I_MPI_ADJUST_<opname>
Control collective operation algorithm selection.

Syntax

I_MPI_ADJUST_<opname>="<algid>[:<conditions>][;<algid>:<conditions>[...]]"

Arguments

<algid>
Algorithm identifier
>= 0. The default value of zero selects a reasonable default setting.

<conditions>
A comma-separated list of conditions. An empty list selects all message sizes and process combinations.
<l>                Messages of size <l>
<l>-<m>            Messages of size from <l> to <m>, inclusive
<l>@<p>            Messages of size <l> and number of processes <p>
<l>-<m>@<p>-<q>    Messages of size from <l> to <m> and number of processes from <p> to <q>, inclusive

Description

Set this environment variable to select the desired algorithm(s) for the collective operation <opname> under particular conditions. Each collective operation has its own environment variable and algorithms.
Environment Variables, Collective Operations, and Algorithms

Environment Variable
Collective Operation
Algorithms
I_MPI_ADJUST_ALLGATHER
MPI_Allgather
  1. Recursive doubling
  2. Bruck's
  3. Ring
  4. Topology aware Gatherv + Bcast
  5. Knomial
I_MPI_ADJUST_ALLGATHERV
MPI_Allgatherv
  1. Recursive doubling
  2. Bruck's
  3. Ring
  4. Topology aware Gatherv + Bcast
I_MPI_ADJUST_ALLREDUCE
MPI_Allreduce
  1. Recursive doubling
  2. Rabenseifner's
  3. Reduce + Bcast
  4. Topology aware Reduce + Bcast
  5. Binomial gather + scatter
  6. Topology aware binomial gather + scatter
  7. Shumilin's ring
  8. Ring
  9. Knomial
  10. Topology aware SHM-based flat
  11. Topology aware SHM-based Knomial
  12. Topology aware SHM-based Knary
I_MPI_ADJUST_ALLTOALL
MPI_Alltoall
  1. Bruck's
  2. Isend/Irecv + waitall
  3. Pairwise exchange
  4. Plum's
I_MPI_ADJUST_ALLTOALLV
MPI_Alltoallv
  1. Isend/Irecv + waitall
  2. Plum's
I_MPI_ADJUST_ALLTOALLW
MPI_Alltoallw
Isend/Irecv + waitall
I_MPI_ADJUST_BARRIER
MPI_Barrier
  1. Dissemination
  2. Recursive doubling
  3. Topology aware dissemination
  4. Topology aware recursive doubling
  5. Binomial gather + scatter
  6. Topology aware binomial gather + scatter
  7. Topology aware SHM-based flat
  8. Topology aware SHM-based Knomial
  9. Topology aware SHM-based Knary
I_MPI_ADJUST_BCAST
MPI_Bcast
  1. Binomial
  2. Recursive doubling
  3. Ring
  4. Topology aware binomial
  5. Topology aware recursive doubling
  6. Topology aware ring
  7. Shumilin's
  8. Knomial
  9. Topology aware SHM-based flat
  10. Topology aware SHM-based Knomial
  11. Topology aware SHM-based Knary
  12. NUMA aware SHM-based (SSE4.2)
  13. NUMA aware SHM-based (AVX2)
  14. NUMA aware SHM-based (AVX512)
I_MPI_ADJUST_EXSCAN
MPI_Exscan
  1. Partial results gathering
  2. Partial results gathering regarding layout of processes
I_MPI_ADJUST_GATHER
MPI_Gather
  1. Binomial
  2. Topology aware binomial
  3. Shumilin's
  4. Binomial with segmentation
I_MPI_ADJUST_GATHERV
MPI_Gatherv
  1. Linear
  2. Topology aware linear
  3. Knomial
I_MPI_ADJUST_REDUCE_SCATTER
MPI_Reduce_scatter
  1. Recursive halving
  2. Pairwise exchange
  3. Recursive doubling
  4. Reduce + Scatterv
  5. Topology aware Reduce + Scatterv
I_MPI_ADJUST_REDUCE
MPI_Reduce
  1. Shumilin's
  2. Binomial
  3. Topology aware Shumilin's
  4. Topology aware binomial
  5. Rabenseifner's
  6. Topology aware Rabenseifner's
  7. Knomial
  8. Topology aware SHM-based flat
  9. Topology aware SHM-based Knomial
  10. Topology aware SHM-based Knary
  11. Topology aware SHM-based binomial
I_MPI_ADJUST_SCAN
MPI_Scan
  1. Partial results gathering
  2. Topology aware partial results gathering
I_MPI_ADJUST_SCATTER
MPI_Scatter
  1. Binomial
  2. Topology aware binomial
  3. Shumilin's
I_MPI_ADJUST_SCATTERV
MPI_Scatterv
  1. Linear
  2. Topology aware linear
I_MPI_ADJUST_IALLGATHER
MPI_Iallgather
  1. Recursive doubling
  2. Bruck's
  3. Ring
I_MPI_ADJUST_IALLGATHERV
MPI_Iallgatherv
  1. Recursive doubling
  2. Bruck's
  3. Ring
I_MPI_ADJUST_IALLREDUCE
MPI_Iallreduce
  1. Recursive doubling
  2. Rabenseifner's
  3. Reduce + Bcast
  4. Ring (Patarasuk)
  5. Knomial
  6. Binomial
  7. Reduce scatter allgather
  8. SMP
  9. Nreduce
I_MPI_ADJUST_IALLTOALL
MPI_Ialltoall
  1. Bruck's
  2. Isend/Irecv + Waitall
  3. Pairwise exchange
I_MPI_ADJUST_IALLTOALLV
MPI_Ialltoallv
Isend/Irecv + Waitall
I_MPI_ADJUST_IALLTOALLW
MPI_Ialltoallw
Isend/Irecv + Waitall
I_MPI_ADJUST_IBARRIER
MPI_Ibarrier
Dissemination
I_MPI_ADJUST_IBCAST
MPI_Ibcast
  1. Binomial
  2. Recursive doubling
  3. Ring
  4. Knomial
  5. SMP
  6. Tree knomial
  7. Tree kary
I_MPI_ADJUST_IEXSCAN
MPI_Iexscan
  1. Recursive doubling
  2. SMP
I_MPI_ADJUST_IGATHER
MPI_Igather
  1. Binomial
  2. Knomial
I_MPI_ADJUST_IGATHERV
MPI_Igatherv
  1. Linear
  2. Linear ssend
I_MPI_ADJUST_IREDUCE_SCATTER
MPI_Ireduce_scatter
  1. Recursive halving
  2. Pairwise
  3. Recursive doubling
I_MPI_ADJUST_IREDUCE
MPI_Ireduce
  1. Rabenseifner's
  2. Binomial
  3. Knomial
I_MPI_ADJUST_ISCAN
MPI_Iscan
  1. Recursive doubling
  2. SMP
I_MPI_ADJUST_ISCATTER
MPI_Iscatter
  1. Binomial
  2. Knomial
I_MPI_ADJUST_ISCATTERV
MPI_Iscatterv
Linear
The message size calculation rules for the collective operations are described in the following table, where "n/a" means that the corresponding interval <l>-<m> should be omitted.
Message Collective Functions

Collective Function    Message Size Formula
MPI_Allgather          recv_count*recv_type_size
MPI_Allgatherv         total_recv_count*recv_type_size
MPI_Allreduce          count*type_size
MPI_Alltoall           send_count*send_type_size
MPI_Alltoallv          n/a
MPI_Alltoallw          n/a
MPI_Barrier            n/a
MPI_Bcast              count*type_size
MPI_Exscan             count*type_size
MPI_Gather             recv_count*recv_type_size if MPI_IN_PLACE is used, otherwise send_count*send_type_size
MPI_Gatherv            n/a
MPI_Reduce_scatter     total_recv_count*type_size
MPI_Reduce             count*type_size
MPI_Scan               count*type_size
MPI_Scatter            send_count*send_type_size if MPI_IN_PLACE is used, otherwise recv_count*recv_type_size
MPI_Scatterv           n/a
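As a worked example of these formulas: an MPI_Bcast with count=1024 and type MPI_DOUBLE (typically 8 bytes) has a message size of count*type_size = 1024*8 = 8192 bytes, so a condition such as 0-8192 would match it.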

Examples

Use the following setting to select the second algorithm for the MPI_Reduce operation:
I_MPI_ADJUST_REDUCE=2
Use the following setting to define the algorithms for the MPI_Reduce_scatter operation:
I_MPI_ADJUST_REDUCE_SCATTER="4:0-100,5001-10000;1:101-3200;2:3201-5000;3"
In this case, algorithm 4 is used for message sizes from 0 through 100 bytes and from 5001 through 10000 bytes, algorithm 1 is used for message sizes from 101 through 3200 bytes, algorithm 2 is used for message sizes from 3201 through 5000 bytes, and algorithm 3 is used for all other message sizes.
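The variable takes effect at launch time, so no source changes are required. A minimal sketch of a C program that exercises such a setting (the file name and launch line are illustrative; -genv passes an environment variable to all ranks under Intel MPI):

/* reduce_example.c - run, for example, as:
 *   mpirun -genv I_MPI_ADJUST_REDUCE 2 -n 4 ./reduce_example
 * so that algorithm 2 (binomial) is selected for MPI_Reduce. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value, sum = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank + 1;
    /* The algorithm choice is applied transparently inside the library. */
    MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %d\n", sum);
    MPI_Finalize();
    return 0;
}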

I_MPI_ADJUST_<opname>_LIST

Syntax

I_MPI_ADJUST_<opname>_LIST=<algid1>[-<algid2>][,<algid3>][,<algid4>-<algid5>]

Description

Set this environment variable to specify the set of algorithms to be considered by the Intel MPI runtime for a specified <opname>. This variable is useful in autotuning scenarios, as well as tuning scenarios where users would like to select a certain subset of algorithms.
Setting an empty string disables autotuning for the <opname> collective.
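For example, the following illustrative setting restricts the runtime to algorithms 1 through 5 and algorithm 9 for MPI_Allreduce (the particular identifiers are arbitrary here):
I_MPI_ADJUST_ALLREDUCE_LIST=1-5,9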
I_MPI_COLL_INTRANODE

Syntax

I_MPI_COLL_INTRANODE=<mode>

Arguments

<mode>
Intranode collectives type
pt2pt    Use only point-to-point communication-based collectives
shm      Use shared memory collectives. This is the default value

Description

Set this environment variable to switch the intranode communication type for collective operations. If there is a large set of communicators, you can switch off SHM collectives to avoid memory overconsumption.
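For example, to fall back to point-to-point based collectives inside a node:
I_MPI_COLL_INTRANODE=pt2pt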
I_MPI_COLL_INTRANODE_SHM_THRESHOLD

Syntax

I_MPI_COLL_INTRANODE_SHM_THRESHOLD=<nbytes>

Arguments

<nbytes>
Maximum data block size processed by shared memory collectives
> 0    Use the specified size. The default value is 16384 bytes

Description

Set this environment variable to define the size of the shared memory area available to each rank for data placement. Messages greater than this value are not processed by the SHM-based collective operation, but are processed by the point-to-point based collective operation instead. The value must be a multiple of 4096.
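For example, to double the default threshold while keeping it a multiple of 4096 (32768 = 8*4096):
I_MPI_COLL_INTRANODE_SHM_THRESHOLD=32768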
I_MPI_COLL_EXTERNAL

Syntax

I_MPI_COLL_EXTERNAL=<arg>

Arguments

<arg>
Binary indicator
enable | yes | on | 1      Enable the external collective operations functionality
disable | no | off | 0     Disable the external collective operations functionality. This is the default value

Description

Set this environment variable to enable external collective operations. The mechanism makes it possible to enable HCOLL. The functionality covers the following collective operations, selected through the corresponding algorithm identifiers:
I_MPI_ADJUST_ALLREDUCE=24, I_MPI_ADJUST_BARRIER=11, I_MPI_ADJUST_BCAST=16, I_MPI_ADJUST_REDUCE=13, I_MPI_ADJUST_ALLGATHER=6, I_MPI_ADJUST_ALLTOALL=5, I_MPI_ADJUST_ALLTOALLV=5, I_MPI_ADJUST_SCAN=3, I_MPI_ADJUST_EXSCAN=3, I_MPI_ADJUST_GATHER=5, I_MPI_ADJUST_GATHERV=4, I_MPI_ADJUST_SCATTER=5, I_MPI_ADJUST_SCATTERV=4, I_MPI_ADJUST_ALLGATHERV=5, I_MPI_ADJUST_ALLTOALLW=2, I_MPI_ADJUST_REDUCE_SCATTER=6, I_MPI_ADJUST_REDUCE_SCATTER_BLOCK=4, I_MPI_ADJUST_IALLGATHER=5, I_MPI_ADJUST_IALLGATHERV=5, I_MPI_ADJUST_IGATHERV=3, I_MPI_ADJUST_IALLREDUCE=9, I_MPI_ADJUST_IALLTOALLV=2, I_MPI_ADJUST_IBARRIER=2, I_MPI_ADJUST_IBCAST=5, I_MPI_ADJUST_IREDUCE=4.
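For example, to route MPI_Allreduce through the external (HCOLL) implementation, both the feature and the corresponding algorithm identifier from the list above would be set:
I_MPI_COLL_EXTERNAL=1
I_MPI_ADJUST_ALLREDUCE=24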

I_MPI_CBWR
Control the reproducibility of floating-point operation results across different platforms, networks, and topologies when the number of processes is the same.

Syntax

I_MPI_CBWR=<arg>

Arguments

<arg>
CBWR compatibility mode

0 (None)
Do not use CBWR in a library-wide mode. CNR-safe communicators may be created with MPI_Comm_dup_with_info explicitly. This is the default value.

1 (Weak mode)
Disable topology aware collectives. The result of a collective operation does not depend on the rank placement. This mode guarantees reproducible results across different runs on the same cluster (independent of the rank placement).

2 (Strict mode)
Disable topology aware collectives and ignore the CPU architecture and interconnect during algorithm selection. This mode guarantees reproducible results across different runs on different clusters (independent of the rank placement, CPU architecture, and interconnect).

Description

Conditional Numerical Reproducibility (CNR) provides controls for obtaining reproducible floating-point results for collective operations. With this feature, Intel MPI collective operations are designed to return the same floating-point results from run to run as long as the number of MPI ranks is the same.
Control this feature with the I_MPI_CBWR environment variable in a library-wide manner, where all collectives on all communicators are guaranteed to have reproducible results. To control floating-point reproducibility in a more precise, per-communicator way, pass the {"I_MPI_CBWR", "yes"} key-value pair to the MPI_Comm_dup_with_info call.
Setting I_MPI_CBWR in a library-wide mode using the environment variable leads to a performance penalty.
CNR-safe communicators created using MPI_Comm_dup_with_info always work in the strict mode. For example:
MPI_Info hint;
MPI_Comm cbwr_safe_world, cbwr_safe_copy;
MPI_Info_create(&hint);
MPI_Info_set(hint, "I_MPI_CBWR", "yes");
MPI_Comm_dup_with_info(MPI_COMM_WORLD, hint, &cbwr_safe_world);
MPI_Comm_dup(cbwr_safe_world, &cbwr_safe_copy);
In the example above, both cbwr_safe_world and cbwr_safe_copy are CNR-safe. Use cbwr_safe_world and its duplicates to get reproducible results for critical operations.
Note that MPI_COMM_WORLD itself may be used for performance-critical operations without reproducibility limitations.
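Putting the pieces together, a minimal self-contained sketch of the per-communicator approach (the file name and the reduced values are illustrative; the MPI calls are standard):

/* cbwr_example.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Info hint;
    MPI_Comm cbwr_safe_world;
    int rank;
    double local = 0.1, global = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Request a CNR-safe duplicate of MPI_COMM_WORLD. */
    MPI_Info_create(&hint);
    MPI_Info_set(hint, "I_MPI_CBWR", "yes");
    MPI_Comm_dup_with_info(MPI_COMM_WORLD, hint, &cbwr_safe_world);

    /* Reductions on the CNR-safe communicator return the same
       floating-point result from run to run for a fixed rank count. */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, cbwr_safe_world);
    if (rank == 0)
        printf("reproducible sum = %.17g\n", global);

    MPI_Info_free(&hint);
    MPI_Comm_free(&cbwr_safe_world);
    MPI_Finalize();
    return 0;
}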
