Developer Reference

  • 2020 Update 2
  • 07/15/2020

Other Environment Variables

I_MPI_DEBUG
Print out debugging information when an MPI program starts running.

Syntax

I_MPI_DEBUG=<level>[,<flags>]

Arguments

<level>    Indicate the level of debug information provided
0          Output no debugging information. This is the default value.
1,2        Output libfabric* version and provider.
3          Output effective MPI rank, pid, and node mapping table.
4          Output process pinning information.
5          Output environment variables specific to Intel® MPI Library.
> 5        Add extra levels of debug information.
<flags>    Comma-separated list of debug flags
pid        Show process ID for each debug message.
tid        Show thread ID for each debug message (for the multithreaded library).
time       Show time for each debug message.
datetime   Show time and date for each debug message.
host       Show host name for each debug message.
level      Show level for each debug message.
scope      Show scope for each debug message.
line       Show source line number for each debug message.
file       Show source file name for each debug message.
nofunc     Do not show routine name.
norank     Do not show rank.
flock      Synchronize debug output from different processes or threads.
nobuf      Do not use buffered I/O for debug output.

Description

Set this environment variable to print debugging information about the application. Set the same <level> value for all ranks.
You can specify the output file name for debug information by setting the I_MPI_DEBUG_OUTPUT environment variable.
Each printed line has the following format:
[<identifier>] <message>
where:
  • <identifier> is the MPI process rank, by default. If you add the '+' sign in front of the <level> number, the <identifier> assumes the following format: rank#pid@hostname. Here, rank is the MPI process rank, pid is the UNIX* process ID, and hostname is the host name. If you add the '-' sign, <identifier> is not printed at all.
  • <message> contains the debugging output.
The following examples demonstrate possible command lines with the corresponding output:
$ mpirun -n 1 -env I_MPI_DEBUG=2 ./a.out
...
[0] MPI startup(): shared memory data transfer mode
The following commands are equivalent and produce the same output:
$ mpirun -n 1 -env I_MPI_DEBUG=+2 ./a.out
$ mpirun -n 1 -env I_MPI_DEBUG=2,pid,host ./a.out
...
[0#1986@mpicluster001] MPI startup(): shared memory data transfer mode
Compiling with the -g option adds a considerable amount of printed debug information.
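For example, to combine a higher debug level with per-message timestamps and levels, you can pass several flags at once (a hedged illustration; the exact output depends on your system):
$ mpirun -n 2 -env I_MPI_DEBUG=5,time,level ./a.out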
I_MPI_DEBUG_OUTPUT
Set output file name for debug information.

Syntax

I_MPI_DEBUG_OUTPUT=<arg>

Arguments

<arg>          String value
stdout         Output to stdout. This is the default value.
stderr         Output to stderr.
<file_name>    Specify the output file name for debug information (the maximum file name length is 256 symbols).

Description

Set this environment variable if you want to split the output of debug information from the output produced by an application. If you use a format placeholder such as %r, %p, or %h, the rank, process ID, or host name is added to the file name accordingly.
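For example, the following hedged sketch uses the %r placeholder to write each rank's debug output to its own file (debug_0.log, debug_1.log, and so on):
$ mpirun -n 2 -env I_MPI_DEBUG=3 -env I_MPI_DEBUG_OUTPUT=debug_%r.log ./a.out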

I_MPI_STATS
Collect MPI statistics from your application using Application Performance Snapshot.

Syntax

I_MPI_STATS=<level>

Arguments

<level>      Indicate the level of statistics collected
1,2,3,4,5    Specify the level to indicate the amount of MPI statistics collected by Application Performance Snapshot (APS). The full description of the levels is available in the official APS documentation.

Description

Set this variable to collect MPI-related statistics from your MPI application using Application Performance Snapshot. The variable creates a new folder aps_result_<date>-<time> containing statistics data. To analyze the collected data, use the aps utility. For example:
$ export I_MPI_STATS=5
$ mpirun -n 2 ./myApp
$ aps-report aps_result_20171231_235959
I_MPI_STARTUP_MODE
Select a mode for the Intel® MPI Library process startup algorithm.

Syntax

I_MPI_STARTUP_MODE=<arg>

Arguments

<arg>             String value
pmi_shm           Use shared memory to reduce the number of PMI calls. This mode is enabled by default.
pmi_shm_netmod    Use the netmod infrastructure for address exchange logic in addition to PMI and shared memory.

Description

The pmi_shm and pmi_shm_netmod modes reduce the application startup time. The effect of these modes is more pronounced at higher -ppn values; with -ppn 1 there is no improvement at all.
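For example, a hedged sketch of selecting the netmod-assisted mode for a densely packed run (the process counts are illustrative):
$ export I_MPI_STARTUP_MODE=pmi_shm_netmod
$ mpirun -n 128 -ppn 64 ./a.out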
I_MPI_PMI_LIBRARY
Specify the name of a third-party implementation of the PMI library.

Syntax

I_MPI_PMI_LIBRARY=<name>

Arguments

<name>    Full name of the third-party PMI library

Description

Set I_MPI_PMI_LIBRARY to specify the name of a third-party PMI library. When you set this environment variable, provide the full path to the library.
Currently supported PMI versions: PMI1, PMI2
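For example, when launching through a resource manager such as Slurm, you might point the variable at the manager's PMI library (the path below is hypothetical; check where your cluster installs it):
$ export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so   # hypothetical path
$ srun -n 4 ./a.out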
I_MPI_PMI_VALUE_LENGTH_MAX
Control the length of the value buffer in PMI on the client side.

Syntax

I_MPI_PMI_VALUE_LENGTH_MAX=<length>

Arguments

<length>    Define the value of the buffer length in bytes.
<n> > 0     The default value is -1, which means do not override the value received from the PMI_KVS_Get_value_length_max() function.

Description

Set this environment variable to control the length of the value buffer in PMI on the client side. The length of the buffer will be the lesser of I_MPI_PMI_VALUE_LENGTH_MAX and PMI_KVS_Get_value_length_max().
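A minimal sketch (the 512-byte cap is an arbitrary illustrative value):
$ export I_MPI_PMI_VALUE_LENGTH_MAX=512   # illustrative value in bytes
$ mpirun -n 2 ./a.out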
I_MPI_OUTPUT_CHUNK_SIZE
Set the size of the
stdout/stderr
output buffer.

Syntax

I_MPI_OUTPUT_CHUNK_SIZE=<size>

Arguments

<size>     Define the output chunk size in kilobytes
<n> > 0    The default chunk size value is 1 KB.

Description

Set this environment variable to increase the size of the buffer used to intercept the standard output and standard error streams from the processes. If the <size> value is not greater than zero, the environment variable setting is ignored and a warning message is displayed.
Use this setting for applications that create a significant amount of output from different processes. With the -ordered-output option of mpiexec.hydra, this setting helps to prevent the output from garbling.
Set the I_MPI_OUTPUT_CHUNK_SIZE environment variable in the shell environment before executing the mpiexec.hydra/mpirun command. Do not use the -genv or -env options for setting the <size> value. Those options are used only for passing environment variables to the MPI process environment.
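For example, a hedged sketch that raises the chunk size to 4 KB in the shell (not via -genv) before an ordered-output run (the size is illustrative):
$ export I_MPI_OUTPUT_CHUNK_SIZE=4   # illustrative size in KB
$ mpirun -ordered-output -n 4 ./a.out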
I_MPI_REMOVED_VAR_WARNING
Print out a warning if a removed environment variable is set.

Syntax

I_MPI_REMOVED_VAR_WARNING=<arg>

Arguments

<arg>                     Binary indicator
enable | yes | on | 1     Print out the warning. This is the default value.
disable | no | off | 0    Do not print the warning.

Description

Use this environment variable to print out a warning if a removed environment variable is set. Warnings are printed regardless of whether
I_MPI_DEBUG
is set.
I_MPI_VAR_CHECK_SPELLING
Print out a warning if an unknown environment variable is set.

Syntax

I_MPI_VAR_CHECK_SPELLING=<arg>

Arguments

<arg>                     Binary indicator
enable | yes | on | 1     Print out the warning. This is the default value.
disable | no | off | 0    Do not print the warning.

Description

Use this environment variable to print out a warning if an unsupported environment variable is set. Warnings are printed for removed or misspelled environment variables.
I_MPI_LIBRARY_KIND
Specify the Intel® MPI Library configuration.

Syntax

I_MPI_LIBRARY_KIND=<value>

Arguments

<value>       String value
release       Multi-threaded optimized library (with the global lock). This is the default value.
debug         Multi-threaded debug library (with the global lock).
release_mt    Multi-threaded optimized library (with per-object lock for the thread-split model).
debug_mt      Multi-threaded debug library (with per-object lock for the thread-split model).

Description

Use this variable to set an argument for the vars.[c]sh script. This script establishes the Intel® MPI Library environment and enables you to specify the appropriate library configuration. To ensure that the desired configuration is set, check the LD_LIBRARY_PATH variable.
Example
$ export I_MPI_LIBRARY_KIND=debug
Setting this variable is equivalent to passing an argument directly to the vars.[c]sh script:
Example
$ . <installdir>/bin/vars.sh release
I_MPI_PLATFORM
Select the intended optimization platform.

Syntax

I_MPI_PLATFORM=<platform>

Arguments

<platform>    Intended optimization platform (string value)
auto[:min]    Optimize for the oldest supported Intel® Architecture Processor across all nodes
auto:max      Optimize for the newest supported Intel® Architecture Processor across all nodes
auto:most     Optimize for the most numerous Intel® Architecture Processor across all nodes. In case of a tie, choose the newer platform
ivb           Optimize for the Intel® Xeon® Processors E3, E5, and E7 V2 series and other Intel® Architecture processors formerly code named Ivy Bridge
hsw           Optimize for the Intel® Xeon® Processors E3, E5, and E7 V3 series and other Intel® Architecture processors formerly code named Haswell
bdw           Optimize for the Intel® Xeon® Processors E3, E5, and E7 V4 series and other Intel® Architecture processors formerly code named Broadwell
knl           Optimize for the Intel® Xeon Phi™ processor and coprocessor formerly code named Knights Landing
skx           Optimize for the Intel® Xeon® Processors E3 V5 and Intel® Xeon® Scalable Family series, and other Intel® Architecture processors formerly code named Skylake
clx           Optimize for the 2nd Generation Intel® Xeon® Scalable Processors, and other Intel® Architecture processors formerly code named Cascade Lake
clx-ap        Optimize for the 2nd Generation Intel® Xeon® Scalable Processors, and other Intel® Architecture processors formerly code named Cascade Lake AP
Note: The explicit clx-ap setting is ignored if the actual platform is not Intel.

Description

Set this environment variable to use the predefined platform settings. The default value is a local platform for each node.
The variable is available for both Intel® and non-Intel microprocessors, but it may utilize more optimizations for Intel microprocessors than for non-Intel microprocessors.
The values auto[:min], auto:max, and auto:most may increase the MPI job startup time.
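For instance, to optimize for the most common processor type across a heterogeneous job (a hedged illustration):
$ export I_MPI_PLATFORM=auto:most
$ mpirun -n 4 ./a.out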

I_MPI_MALLOC
Control the Intel® MPI Library custom allocator of private memory.

Syntax

I_MPI_MALLOC=<arg>

Argument

<arg>    Binary indicator
1        Enable the Intel MPI Library custom allocator of private memory: use the Intel MPI custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.
0        Disable the Intel MPI Library custom allocator of private memory: use the system-provided memory allocator for MPI_Alloc_mem/MPI_Free_mem.

Description

Use this environment variable to enable or disable the Intel MPI Library custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.
By default, I_MPI_MALLOC is enabled for the release and debug Intel MPI library configurations and disabled for the release_mt and debug_mt configurations.
If the platform is not supported by the Intel MPI Library custom allocator of private memory, a system-provided memory allocator is used and the I_MPI_MALLOC variable is ignored.
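For example, a hedged sketch of forcing the system allocator, which you might do when pairing the application with an external memory-debugging tool (illustrative):
$ export I_MPI_MALLOC=0
$ mpirun -n 2 ./a.out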
I_MPI_SHM_HEAP
Control the Intel® MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP=<arg>

Argument

<arg>    Binary indicator
1        Use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
0        Do not use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.

Description

Use this environment variable to enable or disable the Intel MPI Library custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
By default, I_MPI_SHM_HEAP is disabled. If enabled, it can improve performance of the shared memory transport because in that case it is possible to make only one memory copy operation instead of two copy-in/copy-out memory copy operations. If both I_MPI_SHM_HEAP and I_MPI_MALLOC are enabled, the shared memory allocator is used first. The private memory allocator is used only when the required volume of shared memory is not available.
Details
By default, the shared memory segment is allocated on the tmpfs file system at the /dev/shm/ mount point. Starting with Linux kernel 4.7, it is possible to enable transparent huge pages on shared memory. If the Intel MPI Library shared memory heap is used, it is recommended to enable transparent huge pages on your system. To enable transparent huge pages on /dev/shm, contact your system administrator or execute the following command:
$ sudo mount -o remount,huge=advise /dev/shm
To use another tmpfs mount point instead of /dev/shm/, use I_MPI_SHM_FILE_PREFIX_4K, I_MPI_SHM_FILE_PREFIX_2M, and I_MPI_SHM_FILE_PREFIX_1G.
If your application does not use MPI_Alloc_mem/MPI_Free_mem directly, you can override the standard malloc/calloc/realloc/free procedures by preloading the libmpi_shm_heap_proxy.so library:
$ export LD_PRELOAD=$I_MPI_ROOT/lib/libmpi_shm_heap_proxy.so
$ export I_MPI_SHM_HEAP=1
In this case, malloc/calloc/realloc is a proxy for MPI_Alloc_mem and free is a proxy for MPI_Free_mem.
If the platform is not supported by the Intel MPI Library custom allocator of shared memory, the I_MPI_SHM_HEAP variable is ignored.
I_MPI_SHM_HEAP_VSIZE
Change the size (per rank) of virtual shared memory available for the Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_VSIZE=<size>

Argument

<size>    The size (per rank) of shared memory used in the shared memory heap (in megabytes).
> 0       If the shared memory heap is enabled for MPI_Alloc_mem/MPI_Free_mem, the default value is 4096.

Description

The Intel MPI Library custom allocator of shared memory works with fixed-size virtual shared memory. The shared memory segment is allocated at MPI_Init and cannot be enlarged later.
Setting I_MPI_SHM_HEAP_VSIZE=0 completely disables the Intel MPI Library shared memory allocator.
I_MPI_SHM_HEAP_CSIZE
Change the size (per rank) of shared memory cached in the Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_CSIZE=<size>

Argument

<size>    The size (per rank) of shared memory cached in the Intel MPI Library shared memory allocator (in megabytes).
> 0       The default depends on the available shared memory size and the number of ranks. Normally, the size is less than 256.

Description

Small values of I_MPI_SHM_HEAP_CSIZE may reduce overall shared memory consumption. Larger values of this variable may speed up MPI_Alloc_mem/MPI_Free_mem.
I_MPI_SHM_HEAP_OPT
Change the optimization mode of Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_OPT=<mode>

Argument

<mode>    Optimization mode
rank      In this mode, each rank has its own dedicated amount of shared memory. This is the default value when I_MPI_SHM_HEAP=1.
numa      In this mode, all ranks from a NUMA node use the same amount of shared memory.

Description

It is recommended to use I_MPI_SHM_HEAP_OPT=rank when each rank uses the same amount of memory, and I_MPI_SHM_HEAP_OPT=numa when ranks use significantly different amounts of memory. Usually, I_MPI_SHM_HEAP_OPT=rank works faster than I_MPI_SHM_HEAP_OPT=numa, but the numa optimization mode may consume a smaller volume of shared memory.
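Putting the shared memory heap settings together, a hedged sketch (the size and mode are illustrative, not recommendations):
$ export I_MPI_SHM_HEAP=1
$ export I_MPI_SHM_HEAP_VSIZE=8192   # illustrative size in MB
$ export I_MPI_SHM_HEAP_OPT=numa
$ mpirun -n 4 ./a.out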
I_MPI_WAIT_MODE
Control the Intel® MPI Library optimization for oversubscription mode.

Syntax

I_MPI_WAIT_MODE=<arg>

Argument

<arg>    Binary indicator
0        Optimize the MPI application for the normal mode (1 rank per CPU). This is the default value if the number of processes on a computation node is less than or equal to the number of CPUs on the node.
1        Optimize the MPI application for the oversubscription mode (multiple ranks per CPU). This is the default value if the number of processes on a computation node is greater than the number of CPUs on the node.

Description

It is recommended to use this variable in the oversubscription mode.
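For example, a hedged sketch of an oversubscribed run (16 ranks on a node with fewer than 16 CPUs; the counts are illustrative):
$ export I_MPI_WAIT_MODE=1
$ mpirun -n 16 -ppn 16 ./a.out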
I_MPI_THREAD_YIELD
Control the Intel® MPI Library thread yield customization during MPI busy wait time.

Syntax

I_MPI_THREAD_YIELD=<arg>

Argument

<arg>    Integer value
0        Do nothing for thread yield during the busy wait (spin wait). This is the default value when I_MPI_WAIT_MODE=0.
1        Execute the pause processor instruction I_MPI_PAUSE_COUNT times during the busy wait.
2        Execute the sched_yield() system call for thread yield during the busy wait. This is the default value when I_MPI_WAIT_MODE=1.
3        Execute the usleep() system call for I_MPI_THREAD_SLEEP microseconds for thread yield during the busy wait.

Description

It is recommended to use I_MPI_THREAD_YIELD=0 or I_MPI_THREAD_YIELD=1 in the normal mode and I_MPI_THREAD_YIELD=2 or I_MPI_THREAD_YIELD=3 in the oversubscription mode.
I_MPI_PAUSE_COUNT
Control the Intel® MPI Library pause count for the thread yield customization during MPI busy wait time.

Syntax

I_MPI_PAUSE_COUNT=<arg>

Argument

<arg>    Description
>= 0     Pause count for thread yield customization during MPI busy wait time. The default value is 0. Normally, the value is less than 100.

Description

This variable is applicable when I_MPI_THREAD_YIELD=1. Small values of I_MPI_PAUSE_COUNT may increase performance, while larger values may reduce energy consumption.
I_MPI_THREAD_SLEEP
Control the Intel® MPI Library thread sleep timeout (in microseconds) for thread yield customization during MPI busy wait.

Syntax

I_MPI_THREAD_SLEEP=<arg>

Argument

<arg>    Description
>= 0     Thread sleep timeout in microseconds. The default value is 0. Normally, the value is less than 100.

Description

This variable is applicable when I_MPI_THREAD_YIELD=3. Small values of I_MPI_THREAD_SLEEP may increase performance in the normal mode, while larger values may increase performance in the oversubscription mode.
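A hedged sketch combining these knobs for an oversubscribed run (the sleep value and process counts are illustrative):
$ export I_MPI_THREAD_YIELD=3
$ export I_MPI_THREAD_SLEEP=10   # illustrative timeout in microseconds
$ mpirun -n 16 -ppn 16 ./a.out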
I_MPI_EXTRA_FILESYSTEM
Control native support for parallel file systems.

Syntax

I_MPI_EXTRA_FILESYSTEM=<arg>

Argument

<arg>                     Binary indicator
enable | yes | on | 1     Enable native support for parallel file systems.
disable | no | off | 0    Disable native support for parallel file systems.

Description

Use this environment variable to enable or disable native support for parallel file systems.
I_MPI_EXTRA_FILESYSTEM_FORCE
Force the file system recognition logic.

Syntax

I_MPI_EXTRA_FILESYSTEM_FORCE=<ufs|nfs|gpfs|panfs|lustre>
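For example, a hedged sketch that enables native parallel file system support and forces Lustre recognition (assuming the application's I/O path is on a Lustre mount):
$ export I_MPI_EXTRA_FILESYSTEM=1
$ export I_MPI_EXTRA_FILESYSTEM_FORCE=lustre
$ mpirun -n 4 ./a.out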

Product and Performance Information

¹ Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804