Developer Reference

  • 2020 Update 2
  • 07/15/2020
  • Public Content

Shared Memory Control

I_MPI_SHM
Select a shared memory transport to be used.

Syntax

I_MPI_SHM=<transport>

Arguments

<transport>
Define a shared memory transport solution.
disable | no | off | 0
Do not use shared memory transport.
auto
Select a shared memory transport solution automatically.
bdw_sse
The shared memory transport solution tuned for Intel® microarchitecture code name Broadwell. The SSE4.2 instruction set is used.
bdw_avx2
The shared memory transport solution tuned for Intel® microarchitecture code name Broadwell. The AVX2 instruction set is used.
skx_sse
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Skylake. The CLFLUSHOPT instruction and the SSE4.2 instruction set are used.
skx_avx2
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Skylake. The CLFLUSHOPT instruction and the AVX2 instruction set are used.
skx_avx512
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Skylake. The CLFLUSHOPT instruction and the AVX512 instruction set are used.
knl_ddr
The shared memory transport solution tuned for Intel® microarchitecture code name Knights Landing.
knl_mcdram
The shared memory transport solution tuned for Intel® microarchitecture code name Knights Landing. Shared memory buffers may be partially located in the Multi-Channel DRAM (MCDRAM).
clx_sse
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Cascade Lake. The CLFLUSHOPT instruction and the SSE4.2 instruction set are used.
clx_avx2
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Cascade Lake. The CLFLUSHOPT instruction and the AVX2 instruction set are used.
clx_avx512
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Cascade Lake. The CLFLUSHOPT instruction and the AVX512 instruction set are used.
clx-ap
The shared memory transport solution tuned for Intel® Xeon® processors based on Intel® microarchitecture code name Cascade Lake Advanced Performance.

Description

Set this environment variable to select a specific shared memory transport solution.
Automatically selected transports:
  • bdw_avx2
    for Intel® microarchitecture code names Haswell, Broadwell, and Skylake
  • skx_avx2
    for Intel® Xeon® processors based on Intel® microarchitecture code name Skylake
  • clx_avx2
    for Intel® Xeon® processors based on Intel® microarchitecture code name Cascade Lake
  • knl_mcdram
    for Intel® microarchitecture code names Knights Landing and Knights Mill
  • bdw_sse
    for all other platforms
The value of I_MPI_SHM depends on the value of I_MPI_FABRICS as follows: if I_MPI_FABRICS is ofi, I_MPI_SHM is disabled. If I_MPI_FABRICS is shm:ofi, I_MPI_SHM defaults to auto or takes the specified value.
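The dependency on I_MPI_FABRICS described above can be sketched as a small lookup. This is an illustrative Python helper, not part of the Intel MPI Library: the function name effective_shm is hypothetical, and only the two I_MPI_FABRICS values named in this section are handled.

```python
def effective_shm(fabrics, shm=None):
    """Return the I_MPI_SHM setting that applies for an I_MPI_FABRICS value.

    Covers only the two cases this section describes; anything else is rejected.
    """
    if fabrics == "ofi":
        # Shared memory transport is disabled regardless of I_MPI_SHM.
        return "disable"
    if fabrics == "shm:ofi":
        # Use the explicit value if one was given, otherwise fall back to auto.
        return shm if shm else "auto"
    raise ValueError(f"I_MPI_FABRICS value not covered by this sketch: {fabrics!r}")


print(effective_shm("ofi", "skx_avx2"))        # disable
print(effective_shm("shm:ofi"))                # auto
print(effective_shm("shm:ofi", "clx_avx512"))  # clx_avx512
```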
I_MPI_SHM_CELL_FWD_SIZE
Change the size of a shared memory forward cell.

Syntax

I_MPI_SHM_CELL_FWD_SIZE=<nbytes>

Arguments

<nbytes>
The size of a shared memory forward cell in bytes
> 0
The default <nbytes> value depends on the transport used and should normally range from 64K to 1024K.

Description

Forward cells are in-cache message buffer cells used for sending small amounts of data. Lower values are recommended. Set this environment variable to define the size of a forward cell in the shared memory transport.
I_MPI_SHM_CELL_BWD_SIZE
Change the size of a shared memory backward cell.

Syntax

I_MPI_SHM_CELL_BWD_SIZE=<nbytes>

Arguments

<nbytes>
The size of a shared memory backward cell in bytes
> 0
The default <nbytes> value depends on the transport used and should normally range from 64K to 1024K.

Description

Backward cells are out-of-cache message buffer cells used for sending large amounts of data. Higher values are recommended. Set this environment variable to define the size of a backward cell in the shared memory transport.
I_MPI_SHM_CELL_EXT_SIZE
Change the size of a shared memory extended cell.

Syntax

I_MPI_SHM_CELL_EXT_SIZE=<nbytes>

Arguments

<nbytes>
The size of a shared memory extended cell in bytes
> 0
The default <nbytes> value depends on the transport used and should normally range from 64K to 1024K.

Description

Extended cells are used in imbalanced applications when the forward and backward cells have run out. An extended cell does not have a specific owner; it is shared between all ranks on the computing node. Set this environment variable to define the size of an extended cell in the shared memory transport.
I_MPI_SHM_CELL_FWD_NUM
Change the number of forward cells in the shared memory transport (per rank).

Syntax

I_MPI_SHM_CELL_FWD_NUM=<num>

Arguments

<num>
The number of shared memory forward cells
> 0
The default value depends on the transport used and should normally range from 4 to 16.

Description

Set this environment variable to define the number of forward cells in the shared memory transport.
I_MPI_SHM_CELL_BWD_NUM
Change the number of backward cells in the shared memory transport (per rank).

Syntax

I_MPI_SHM_CELL_BWD_NUM=<num>

Arguments

<num>
The number of shared memory backward cells
> 0
The default value depends on the transport used and should normally range from 4 to 64.

Description

Set this environment variable to define the number of backward cells in the shared memory transport.
I_MPI_SHM_CELL_EXT_NUM_TOTAL
Change the total number of extended cells in the shared memory transport.

Syntax

I_MPI_SHM_CELL_EXT_NUM_TOTAL=<num>

Arguments

<num>
The number of shared memory extended cells
> 0
The default value depends on the transport used and should normally range from 2K to 8K.

Description

Set this environment variable to define the number of extended cells in the shared memory transport.
This is not a “per rank” number; it is the total number of extended cells on the computing node.
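Because forward and backward cells are counted per rank while extended cells are counted once per node, a rough estimate of the cell memory on one node can be sketched as follows. The function name and the sample settings are hypothetical, and the estimate assumes that memory use is simply the number of cells multiplied by the cell size; any additional overhead in the library is not modeled.

```python
def shm_cell_bytes_per_node(ranks, fwd_num, fwd_size, bwd_num, bwd_size,
                            ext_num_total, ext_size):
    """Rough estimate of shared memory cell storage on one node, in bytes.

    Assumes storage is (number of cells x cell size) with no extra overhead.
    """
    per_rank = fwd_num * fwd_size + bwd_num * bwd_size
    # Extended cells are counted once per node, not once per rank.
    return ranks * per_rank + ext_num_total * ext_size


# Hypothetical settings, chosen only for illustration.
total = shm_cell_bytes_per_node(
    ranks=4,
    fwd_num=8, fwd_size=64 * 1024,
    bwd_num=16, bwd_size=256 * 1024,
    ext_num_total=2048, ext_size=64 * 1024,
)
print(total // (1024 * 1024), "MiB")  # 146 MiB
```

In this example the extended cells account for 128 MiB of the total, which shows why I_MPI_SHM_CELL_EXT_NUM_TOTAL should not be multiplied by the rank count.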
I_MPI_SHM_CELL_FWD_HOLD_NUM
Change the number of hold forward cells in the shared memory transport (per rank).

Syntax

I_MPI_SHM_CELL_FWD_HOLD_NUM=<num>

Arguments

<num>
The number of shared memory hold forward cells
> 0
The default value depends on the transport used and must be less than I_MPI_SHM_CELL_FWD_NUM.

Description

Set this environment variable to define the number of forward cells in the shared memory transport a rank can hold at the same time. Recommended values are powers of two in the range between 1 and 8.
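The constraints above can be checked before launching a job. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that the "less than I_MPI_SHM_CELL_FWD_NUM" requirement applies to any value you set, not only to the default.

```python
def check_fwd_hold_num(hold_num, fwd_num):
    """Return a list of problems with an I_MPI_SHM_CELL_FWD_HOLD_NUM value."""
    problems = []
    if hold_num <= 0:
        problems.append("must be greater than 0")
    if hold_num >= fwd_num:
        problems.append("must be less than I_MPI_SHM_CELL_FWD_NUM")
    # Recommended, not required: a power of two between 1 and 8.
    if hold_num not in (1, 2, 4, 8):
        problems.append("not a recommended value (1, 2, 4, or 8)")
    return problems


print(check_fwd_hold_num(4, 8))  # []
print(check_fwd_hold_num(8, 8))  # ['must be less than I_MPI_SHM_CELL_FWD_NUM']
```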
I_MPI_SHM_MCDRAM_LIMIT
Change the size of the shared memory bound to the multi-channel DRAM (MCDRAM) (size per rank).

Syntax

I_MPI_SHM_MCDRAM_LIMIT=<nbytes>

Arguments

<nbytes>
The size of the shared memory bound to MCDRAM per rank
1048576
This is the default value.

Description

Set this environment variable to define how much MCDRAM memory per rank is allowed for the shared memory transport. This variable takes effect with I_MPI_SHM=knl_mcdram only.
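Since the limit is set per rank, the MCDRAM budget for a whole node scales with the number of ranks placed on it. The sketch below illustrates that arithmetic; the function name is hypothetical, and the case where I_MPI_SHM=auto resolves to knl_mcdram is deliberately not modeled.

```python
DEFAULT_MCDRAM_LIMIT = 1048576  # bytes per rank, the default listed above


def mcdram_budget_per_node(ranks_per_node, shm, limit=DEFAULT_MCDRAM_LIMIT):
    """Return the per-node MCDRAM budget in bytes, or None if the limit is unused."""
    if shm != "knl_mcdram":
        # The variable takes effect with I_MPI_SHM=knl_mcdram only.
        return None
    return ranks_per_node * limit


print(mcdram_budget_per_node(64, "knl_mcdram"))  # 67108864
print(mcdram_budget_per_node(64, "knl_ddr"))     # None
```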
I_MPI_SHM_SEND_SPIN_COUNT
Control the spin count value for the shared memory transport for sending messages.

Syntax

I_MPI_SHM_SEND_SPIN_COUNT=<count>

Arguments

<count>
Define the spin count value. A typical value range is between 1 and 1000.

Description

If the recipient ingress buffer is full, the sender may be blocked until this spin count value is reached. It has no effect when sending small messages.
I_MPI_SHM_RECV_SPIN_COUNT
Control the spin count value for the shared memory transport for receiving messages.

Syntax

I_MPI_SHM_RECV_SPIN_COUNT=<count>

Arguments

<count>
Define the spin count value. A typical value range is between 1 and 1000000.

Description

If the receive is non-blocking, this spin count is used only for safe reordering of expected and unexpected messages. It has no effect on receiving small messages.
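As background for both spin count variables, a bounded spin loop polls a condition a fixed number of times before giving up. The Python sketch below is a generic illustration of that idea, not the Intel MPI Library implementation; the function name is hypothetical, and what the library does once the count is reached is not described in this section.

```python
def spin_until(ready, spin_count):
    """Poll ready() up to spin_count times; return True if it became true."""
    for _ in range(spin_count):
        if ready():
            return True
    # The spin count was reached without the condition becoming true.
    return False


# Simulated buffer that frees up on the third poll.
polls = iter([False, False, True])
print(spin_until(lambda: next(polls), spin_count=1000))  # True
print(spin_until(lambda: False, spin_count=1000))        # False
```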
I_MPI_SHM_FILE_PREFIX_4K
Change the mount point of the 4 KB page size file system (tmpfs) where the shared memory files are created.

Syntax

I_MPI_SHM_FILE_PREFIX_4K=<path>

Arguments

<path>
Define the path to the existing mount point of the 4 KB page size file system (tmpfs). By default, the path is not set.

Description

Set this environment variable to define a new path to the shared memory files. By default, the shared memory files are created at /dev/shm/.
This variable affects shared memory transport buffers and RMA windows.
Example
I_MPI_SHM_FILE_PREFIX_4K=/dev/shm/intel/
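When a job is launched from a script, the variable can be added to the child environment after a basic sanity check. The helper below is a minimal sketch with a hypothetical name; it only verifies that the path is an existing directory and does not confirm that the directory is a tmpfs mount point.

```python
import os


def env_with_shm_prefix_4k(path, base_env=None):
    """Return a copy of the environment with I_MPI_SHM_FILE_PREFIX_4K set."""
    if not os.path.isdir(path):
        raise FileNotFoundError(f"not an existing directory: {path}")
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["I_MPI_SHM_FILE_PREFIX_4K"] = path
    return env


# The current directory stands in for a real tmpfs mount point here.
env = env_with_shm_prefix_4k(os.getcwd(), base_env={})
print(env)
```

In real use, leave base_env unset so that the existing environment is inherited, then pass the result as the env argument of subprocess.run when starting the MPI launcher.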
I_MPI_SHM_FILE_PREFIX_2M
Change the mount point of the 2 MB page size file system (hugetlbfs) where the shared memory files are created.

Syntax

I_MPI_SHM_FILE_PREFIX_2M=<path>

Arguments

<path>
Define the path to the existing mount point of the 2 MB page size file system (hugetlbfs). By default, the path is not set.

Description

Set this environment variable to enable 2 MB huge pages in the Intel MPI Library.
The variable affects shared memory transport buffers. It may affect RMA windows as well if the window size is greater than or equal to 2 MB.
Example
I_MPI_SHM_FILE_PREFIX_2M=/dev/hugepages
Root privileges are required to configure the huge pages subsystem. Contact your system administrator to obtain permission.
I_MPI_SHM_FILE_PREFIX_1G
Change the mount point of the 1 GB page size file system (hugetlbfs) where the shared memory files are created.

Syntax

I_MPI_SHM_FILE_PREFIX_1G=<path>

Arguments

<path>
Define the path to the existing mount point of the 1 GB page size file system (hugetlbfs). By default, the path is not set.

Description

Set this environment variable to enable 1 GB huge pages in the Intel MPI Library.
The variable affects shared memory transport buffers. It may affect RMA windows as well if the window size is greater than or equal to 1 GB.
Example
I_MPI_SHM_FILE_PREFIX_1G=/dev/hugepages1G
Root privileges are required to configure the huge pages subsystem. Contact your system administrator to obtain permission.
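The window size thresholds stated for the three I_MPI_SHM_FILE_PREFIX_* variables can be summarized in a short helper. This is a sketch under two assumptions: 1 MB is taken as 2^20 bytes, and the function name is hypothetical. It lists which variables may be relevant to an RMA window of a given size; whether the library actually uses a given page size for a window is not determined here.

```python
MB = 1024 * 1024  # assumption: binary megabytes
GB = 1024 * MB


def prefix_vars_for_window(window_bytes):
    """List the I_MPI_SHM_FILE_PREFIX_* variables that may apply to an RMA window."""
    names = ["I_MPI_SHM_FILE_PREFIX_4K"]
    if window_bytes >= 2 * MB:
        names.append("I_MPI_SHM_FILE_PREFIX_2M")
    if window_bytes >= GB:
        names.append("I_MPI_SHM_FILE_PREFIX_1G")
    return names


print(prefix_vars_for_window(512 * 1024))  # 4K only
print(prefix_vars_for_window(4 * MB))      # 4K and 2M
print(prefix_vars_for_window(2 * GB))      # 4K, 2M, and 1G
```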
