Developer Reference

Other Environment Variables

I_MPI_DEBUG

Print out debugging information when an MPI program starts running.
Syntax
I_MPI_DEBUG=<level>[,<flags>]
Arguments
<level>
Indicate the level of debug information provided.
0
Output no debugging information. This is the default value.
1
Output libfabric* version and provider.
2
Output information about the tuning file used.
3
Output effective MPI rank, pid, and node mapping table.
4
Output process pinning information.
5
Output environment variables specific to the Intel® MPI Library.
> 5
Add extra levels of debug information.
<flags>
Comma-separated list of debug flags
pid
Show process id for each debug message.
tid
Show thread id for each debug message for multithreaded library.
time
Show time for each debug message.
datetime
Show time and date for each debug message.
host
Show host name for each debug message.
level
Show level for each debug message.
scope
Show scope for each debug message.
line
Show source line number for each debug message.
file
Show source file name for each debug message.
nofunc
Do not show routine name.
norank
Do not show rank.
nousrwarn
Suppress warnings for improper use cases (for example, an incompatible combination of controls).
flock
Synchronize debug output from different processes or threads.
nobuf
Do not use buffered I/O for debug output.
Description
Set this environment variable to print debugging information about the application.
Set the same <level> value for all ranks.
You can specify the output file name for debug information by setting the I_MPI_DEBUG_OUTPUT environment variable.
Each printed line has the following format:
[<identifier>] <message>
where:
  • <identifier> is the MPI process rank, by default. If you add the '+' sign in front of the <level> number, the <identifier> assumes the following format: rank#pid@hostname. Here, rank is the MPI process rank, pid is the UNIX* process ID, and hostname is the host name. If you add the '-' sign, <identifier> is not printed at all.
  • <message> contains the debugging output.
The following examples demonstrate possible command lines with the corresponding output:
$ mpirun -n 1 -env I_MPI_DEBUG=2 ./a.out
...
[0] MPI startup(): shared memory data transfer mode
The following commands are equivalent and produce the same output:
$ mpirun -n 1 -env I_MPI_DEBUG=+2 ./a.out
$ mpirun -n 1 -env I_MPI_DEBUG=2,pid,host ./a.out
...
[0#1986@mpicluster001] MPI startup(): shared memory data transfer mode
Compiling with the -g option adds a considerable amount of printed debug information.
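As an illustration of the <level>[,<flags>] syntax, the following sketch composes an I_MPI_DEBUG value in a shell script and passes it with the -env form used in the examples above. The chosen level and flags, the rank count, and the ./a.out binary are placeholders rather than recommendations, and the launch step runs only when mpirun and the binary are present.

```shell
#!/bin/sh
# Sketch: build an I_MPI_DEBUG value of the form <level>[,<flags>].
debug_level=4                  # level 4: process pinning information
debug_flags="pid,host,time"    # flags taken from the <flags> list
debug_value="${debug_level},${debug_flags}"
echo "I_MPI_DEBUG=${debug_value}"

# Launch only when mpirun and the placeholder binary are available.
if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
    mpirun -n 2 -env I_MPI_DEBUG="${debug_value}" ./a.out
fi
```

Keeping the level and the flags in separate variables makes it easy to change one without retyping the other.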

I_MPI_DEBUG_OUTPUT

Set output file name for debug information.
Syntax
I_MPI_DEBUG_OUTPUT=<arg>
Arguments
Argument
String Value
stdout
Output to stdout. This is the default value.
stderr
Output to stderr.
<file_name>
Specify the output file name for debug information (the maximum file name length is 256 symbols).
Description
Set this environment variable if you want to separate the debug output from the output produced by an application. If you use a format specifier such as %r, %p, or %h, the rank, process ID, or host name is added to the file name accordingly.
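The following sketch shows one possible way to write debug output to a separate file per rank and host. The file name pattern is hypothetical. The length check is applied to the pattern before the library expands %r and %h, which is an assumption about how the 256-symbol limit is counted.

```shell
# Sketch: send debug output to one file per rank and host.
# %r and %h are replaced by the library with the rank and the host name.
debug_file_pattern="mpi_debug_%r_%h.log"   # hypothetical file name

# The documented limit on the file name length is 256 symbols.
if [ "${#debug_file_pattern}" -le 256 ]; then
    export I_MPI_DEBUG=3
    export I_MPI_DEBUG_OUTPUT="$debug_file_pattern"
    echo "I_MPI_DEBUG_OUTPUT=${I_MPI_DEBUG_OUTPUT}"
else
    echo "pattern is longer than 256 symbols" >&2
fi
```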

I_MPI_DEBUG_COREDUMP

Control core dump file generation in case of a failure during MPI application execution.
Syntax
I_MPI_DEBUG_COREDUMP=<arg>
Arguments
Argument
Binary Indicator
enable|yes|on|1
Enable core dump file generation.
disable|no|off|0
Do not generate core dump files. This is the default value.
Description
Set this environment variable to enable core dump file generation when the application terminates because of a segmentation fault. Available for both release and debug builds.

I_MPI_STATS

Collect MPI statistics from your application using Application Performance Snapshot.
Syntax
I_MPI_STATS=<level>
Arguments
<level>
Indicate the level of statistics collected
1,2,3,4,5
Specify the level to indicate the amount of MPI statistics to be collected by Application Performance Snapshot (APS).
The full description of levels is available in the official APS documentation.
Description
Set this variable to collect MPI-related statistics from your MPI application using Application Performance Snapshot. The variable creates a new folder aps_result_<date>-<time> containing statistics data. To analyze the collected data, use the aps utility. For example:
$ export I_MPI_STATS=5
$ mpirun -n 2 ./myApp
$ aps-report aps_result_20171231_235959

I_MPI_STARTUP_MODE

Select a mode for the Intel® MPI Library process startup algorithm.
Syntax
I_MPI_STARTUP_MODE=<arg>
Arguments
Argument
String Value
pmi_shm
Use shared memory to reduce the number of PMI calls. This mode is enabled by default.
pmi_shm_netmod
Use the netmod infrastructure for address exchange logic in addition to PMI and shared memory.
Description
The pmi_shm and pmi_shm_netmod modes reduce the application startup time. The efficiency of the modes is more clearly observed with higher -ppn values, while there is no improvement at all with -ppn 1.

I_MPI_PMI_LIBRARY

Specify the name of a third-party implementation of the PMI library.
Syntax
I_MPI_PMI_LIBRARY=<name>
Arguments
<name>
Full name of the third-party PMI library
Description
Set I_MPI_PMI_LIBRARY to specify the name of a third-party PMI library. When you set this environment variable, provide the full name of the library, including its full path.
Currently supported PMI versions: PMI1, PMI2
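Because the variable expects the full path to the library, a launch script can check the path before exporting it. The sketch below is one way to do that. The helper name set_pmi_library and the library location /usr/lib64/libpmi2.so are hypothetical and depend on your resource manager installation.

```shell
# Sketch: export I_MPI_PMI_LIBRARY only for an absolute path to an existing file.
set_pmi_library() {
    lib_path=$1
    case "$lib_path" in
        /*) ;;   # absolute path, continue
        *) echo "error: not a full path: $lib_path" >&2; return 1 ;;
    esac
    if [ ! -f "$lib_path" ]; then
        echo "error: no such file: $lib_path" >&2
        return 1
    fi
    I_MPI_PMI_LIBRARY=$lib_path
    export I_MPI_PMI_LIBRARY
}

# Hypothetical location; adjust to where your PMI library is installed.
set_pmi_library /usr/lib64/libpmi2.so || echo "I_MPI_PMI_LIBRARY was not changed"
```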

I_MPI_PMI_VALUE_LENGTH_MAX

Control the length of the value buffer in PMI on the client side.
Syntax
I_MPI_PMI_VALUE_LENGTH_MAX=<length>
Arguments
<length>
Define the value of the buffer length in bytes.
<n> > 0
The default value is -1, which means do not override the value received from the PMI_KVS_Get_value_length_max() function.
Description
Set this environment variable to control the length of the value buffer in PMI on the client side. The length of the buffer will be the lesser of I_MPI_PMI_VALUE_LENGTH_MAX and PMI_KVS_Get_value_length_max().
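The "lesser of" rule can be written out as a small worked example. The function below is only an illustration of the arithmetic described above, not library code. The function name and the value 1024 standing in for the result of PMI_KVS_Get_value_length_max() are hypothetical, and treating any non-positive setting like the default -1 is an assumption.

```shell
# Sketch: compute the effective PMI value buffer length.
effective_pmi_value_length() {
    override=$1   # I_MPI_PMI_VALUE_LENGTH_MAX, or -1 when not set
    pmi_max=$2    # value reported by PMI_KVS_Get_value_length_max()
    if [ "$override" -gt 0 ] && [ "$override" -lt "$pmi_max" ]; then
        echo "$override"
    else
        echo "$pmi_max"
    fi
}

effective_pmi_value_length 64 1024     # prints 64
effective_pmi_value_length -1 1024     # prints 1024
effective_pmi_value_length 4096 1024   # prints 1024
```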

I_MPI_OUTPUT_CHUNK_SIZE

Set the size of the stdout/stderr output buffer.
Syntax
I_MPI_OUTPUT_CHUNK_SIZE=<size>
Arguments
<size>
Define output chunk size in kilobytes
<n> > 0
The default chunk size value is 1 KB
Description
Set this environment variable to increase the size of the buffer used to intercept the standard output and standard error streams from the processes. If the <size> value is not greater than zero, the environment variable setting is ignored and a warning message is displayed.
Use this setting for applications that create a significant amount of output from different processes. With the -ordered-output option of mpiexec.hydra, this setting helps to prevent the output from garbling.
Set the I_MPI_OUTPUT_CHUNK_SIZE environment variable in the shell environment before executing the mpiexec.hydra/mpirun command. Do not use the -genv or -env options for setting the <size> value. Those options are used only for passing environment variables to the MPI process environment.
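A minimal sketch of the ordering described above: the variable is exported in the launching shell first, and the launcher is started afterwards. The chunk size of 8, the rank count, and the ./a.out binary are example values, and the launch step runs only when the launcher and the binary are present.

```shell
# Sketch: set the chunk size in the launching shell, not with -genv or -env.
export I_MPI_OUTPUT_CHUNK_SIZE=8    # kilobytes; 8 is an arbitrary example value
echo "I_MPI_OUTPUT_CHUNK_SIZE=${I_MPI_OUTPUT_CHUNK_SIZE}"

# Launch only when mpiexec.hydra and the placeholder binary are available.
if command -v mpiexec.hydra >/dev/null 2>&1 && [ -x ./a.out ]; then
    mpiexec.hydra -ordered-output -n 4 ./a.out
fi
```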

I_MPI_REMOVED_VAR_WARNING

Print out a warning if a removed environment variable is set.
Syntax
I_MPI_REMOVED_VAR_WARNING=<arg>
Arguments
Argument
Binary Indicator
enable | yes | on | 1
Print out the warning. This is the default value
disable | no | off | 0
Do not print the warning
Description
Use this environment variable to print out a warning if a removed environment variable is set. Warnings are printed regardless of whether I_MPI_DEBUG is set.

I_MPI_VAR_CHECK_SPELLING

Print out a warning if an unknown environment variable is set.
Syntax
I_MPI_VAR_CHECK_SPELLING=<arg>
Arguments
Argument
Binary Indicator
enable | yes | on | 1
Print out the warning. This is the default value
disable | no | off | 0
Do not print the warning
Description
Use this environment variable to print out a warning if an unsupported environment variable is set. Warnings are printed for removed or misspelled environment variables.

I_MPI_LIBRARY_KIND

Specify the Intel® MPI Library configuration.
Syntax
I_MPI_LIBRARY_KIND=<value>
Arguments
Value
Description
release
Multi-threaded optimized library (with the global lock). This is the default value
debug
Multi-threaded debug library (with the global lock)
release_mt
Multi-threaded optimized library (with per-object lock for the thread-split model)
debug_mt
Multi-threaded debug library (with per-object lock for the thread-split model)
Description
Use this variable to set an argument for the vars.[c]sh script. This script establishes the Intel® MPI Library environment and enables you to specify the appropriate library configuration. To ensure that the desired configuration is set, check the LD_LIBRARY_PATH variable.
Example
$ export I_MPI_LIBRARY_KIND=debug
Setting this variable is equivalent to passing an argument directly to the vars.[c]sh script:
Example
$ . <installdir>/bin/vars.sh release

I_MPI_PLATFORM

Select the intended optimization platform.
Syntax
I_MPI_PLATFORM=<platform>
Arguments
<platform>
Intended optimization platform (string value)
auto
Use only with heterogeneous runs to determine the appropriate platform across all nodes. May slow down MPI initialization time due to collective operation across all nodes.
ivb
Optimize for the Intel® Xeon® Processors E3, E5, and E7 V2 series and other Intel® Architecture processors formerly code named Ivy Bridge.
hsw
Optimize for the Intel Xeon Processors E3, E5, and E7 V3 series and other Intel® Architecture processors formerly code named Haswell.
bdw
Optimize for the Intel Xeon Processors E3, E5, and E7 V4 series and other Intel Architecture processors formerly code named Broadwell.
knl
Optimize for the Intel® Xeon Phi™ processor and coprocessor formerly code named Knights Landing.
skx
Optimize for the Intel Xeon Processors E3 V5 and Intel Xeon Scalable Family series, and other Intel Architecture processors formerly code named Skylake.
clx
Optimize for the 2nd Generation Intel Xeon Scalable Processors, and other Intel® Architecture processors formerly code named Cascade Lake.
clx-ap
Optimize for the 2nd Generation Intel Xeon Scalable Processors, and other Intel Architecture processors formerly code named Cascade Lake AP.
Note: The explicit clx-ap setting is ignored if the actual platform is not Intel.
Description
Set this environment variable to use the predefined platform settings. The default value is a local platform for each node.
The variable is available for both Intel and non-Intel microprocessors, but it may apply more optimizations for Intel microprocessors than for non-Intel microprocessors.
The values auto[:min], auto:max, and auto:most may increase the MPI job startup time.

I_MPI_MALLOC

Control the Intel® MPI Library custom allocator of private memory.
Syntax
I_MPI_MALLOC=<arg>
Argument
Argument
Binary Indicator
1
Enable the Intel MPI Library custom allocator of private memory.
Use the Intel MPI custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.
0
Disable the Intel MPI Library custom allocator of private memory.
Use the system-provided memory allocator for MPI_Alloc_mem/MPI_Free_mem.
Description
Use this environment variable to enable or disable the Intel MPI Library custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.
By default, I_MPI_MALLOC is enabled for the release and debug Intel MPI Library configurations and disabled for the release_mt and debug_mt configurations.
If the platform is not supported by the Intel MPI Library custom allocator of private memory, a system-provided memory allocator is used and the I_MPI_MALLOC variable is ignored.

I_MPI_SHM_HEAP

Control the Intel® MPI Library custom allocator of shared memory.
Syntax
I_MPI_SHM_HEAP=<arg>
Argument
Argument
Binary Indicator
1
Use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
0
Do not use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
Description
Use this environment variable to enable or disable the Intel MPI Library custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
By default, I_MPI_SHM_HEAP is disabled. If enabled, it can improve performance of the shared memory transport because only one memory copy operation is needed instead of two copy-in/copy-out memory copy operations. If both I_MPI_SHM_HEAP and I_MPI_MALLOC are enabled, the shared memory allocator is used first. The private memory allocator is used only when the required volume of shared memory is not available.
Details
By default, the shared memory segment is allocated on the tmpfs file system on the /dev/shm/ mount point. Starting from Linux kernel 4.7, it is possible to enable transparent huge pages on the shared memory. If the Intel MPI Library shared memory heap is used, it is recommended to enable transparent huge pages on your system. To enable transparent huge pages on /dev/shm, contact your system administrator or execute the following command:
sudo mount -o remount,huge=advise /dev/shm
To use another tmpfs mount point instead of /dev/shm/, use I_MPI_SHM_FILE_PREFIX_4K, I_MPI_SHM_FILE_PREFIX_2M, and I_MPI_SHM_FILE_PREFIX_1G.
If your application does not use MPI_Alloc_mem/MPI_Free_mem directly, you can override the standard malloc/calloc/realloc/free procedures by preloading the libmpi_shm_heap_proxy.so library:
export LD_PRELOAD=$I_MPI_ROOT/lib/libmpi_shm_heap_proxy.so
export I_MPI_SHM_HEAP=1
In this case, malloc/calloc/realloc is a proxy for MPI_Alloc_mem and free is a proxy for MPI_Free_mem.
If the platform is not supported by the Intel MPI Library custom allocator of shared memory, the I_MPI_SHM_HEAP variable is ignored.

I_MPI_SHM_HEAP_VSIZE

Change the size (per rank) of virtual shared memory available for the Intel MPI Library custom allocator of shared memory.
Syntax
I_MPI_SHM_HEAP_VSIZE=<size>
Argument
<size>
The size (per rank) of shared memory used in shared memory heap (in megabytes).
>0
If shared memory heap is enabled for MPI_Alloc_mem/MPI_Free_mem, the default value is 4096.
Description
The Intel MPI Library custom allocator of shared memory works with fixed-size virtual shared memory. The shared memory segment is allocated during MPI_Init and cannot be enlarged later.
Setting I_MPI_SHM_HEAP_VSIZE=0 completely disables the Intel MPI Library shared memory allocator.

I_MPI_SHM_HEAP_CSIZE

Change the size (per rank) of shared memory cached in the Intel MPI Library custom allocator of shared memory.
Syntax
I_MPI_SHM_HEAP_CSIZE=<size>
Argument
<size>
The size (per rank) of shared memory used in Intel MPI Library shared memory allocator (in megabytes).
>0
It depends on the available shared memory size and number of ranks. Normally, the size is less than 256.
Description
Small values of I_MPI_SHM_HEAP_CSIZE may reduce overall shared memory consumption. Larger values of this variable may speed up MPI_Alloc_mem/MPI_Free_mem.

I_MPI_SHM_HEAP_OPT

Change the optimization mode of Intel MPI Library custom allocator of shared memory.
Syntax
I_MPI_SHM_HEAP_OPT=<mode>
Argument
Mode
Optimization Mode
rank
In this mode, each rank has its own dedicated amount of shared memory. This is the default value when I_MPI_SHM_HEAP=1.
numa
In this mode, all ranks from the same NUMA node use the same amount of shared memory.
Description
It is recommended to use I_MPI_SHM_HEAP_OPT=rank when each rank uses the same amount of memory, and I_MPI_SHM_HEAP_OPT=numa when ranks use significantly different amounts of memory.
Usually, I_MPI_SHM_HEAP_OPT=rank works faster than I_MPI_SHM_HEAP_OPT=numa, but the numa optimization mode may consume a smaller volume of shared memory.
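The shared memory heap variables are typically set together. The following sketch shows one possible combination for an application whose ranks use significantly different amounts of memory. The sizes are example values chosen for illustration only, not tuned recommendations.

```shell
# Sketch: enable the shared memory heap and select the numa optimization mode.
export I_MPI_SHM_HEAP=1
export I_MPI_SHM_HEAP_VSIZE=8192   # megabytes per rank; example value (default is 4096)
export I_MPI_SHM_HEAP_CSIZE=64     # megabytes per rank; example value below 256
export I_MPI_SHM_HEAP_OPT=numa     # ranks use significantly different amounts of memory

# Show the resulting settings.
env | grep '^I_MPI_SHM_HEAP' | sort
```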

I_MPI_WAIT_MODE

Control the Intel® MPI Library optimization for oversubscription mode.
Syntax
I_MPI_WAIT_MODE=<arg>
Arguments
Argument
Binary Indicator
0
Optimize MPI application to work in the normal mode (1 rank on 1 CPU). This is the default value if the number of processes on a computation node is less than or equal to the number of CPUs on the node.
1
Optimize MPI application to work in the oversubscription mode (multiple ranks on 1 CPU). This is the default value if the number of processes on a computation node is greater than the number of CPUs on the node.
Description
It is recommended to use this variable in the oversubscription mode.

I_MPI_THREAD_YIELD

Control the Intel® MPI Library thread yield customization during MPI busy wait time.
Syntax
I_MPI_THREAD_YIELD=<arg>
Arguments
Argument
Description
0
Do nothing for thread yield during the busy wait (spin wait). This is the default value when I_MPI_WAIT_MODE=0.
1
Do the pause processor instruction for I_MPI_PAUSE_COUNT during the busy wait.
2
Do the sched_yield() system call for thread yield during the busy wait.
This is the default value when I_MPI_WAIT_MODE=1.
3
Do the usleep() system call for I_MPI_THREAD_SLEEP number of microseconds for thread yield during the busy wait.
Description
It is recommended to use I_MPI_THREAD_YIELD=0 or I_MPI_THREAD_YIELD=1 in the normal mode and I_MPI_THREAD_YIELD=2 or I_MPI_THREAD_YIELD=3 in the oversubscription mode.
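The relationship between the two variables can be sketched as a small helper that compares the number of ranks on a node with the number of CPUs and prints the matching default pair. The function name choose_wait_settings is hypothetical, and the rank and CPU counts are supplied by the caller. The library applies these defaults on its own, so this only illustrates the rule.

```shell
# Sketch: pick wait mode and thread yield settings from the ranks-to-CPUs ratio.
choose_wait_settings() {
    ranks_per_node=$1
    cpus_per_node=$2
    if [ "$ranks_per_node" -gt "$cpus_per_node" ]; then
        # Oversubscription mode: multiple ranks on one CPU.
        echo "I_MPI_WAIT_MODE=1 I_MPI_THREAD_YIELD=2"
    else
        # Normal mode: one rank on one CPU.
        echo "I_MPI_WAIT_MODE=0 I_MPI_THREAD_YIELD=0"
    fi
}

choose_wait_settings 16 8   # prints I_MPI_WAIT_MODE=1 I_MPI_THREAD_YIELD=2
choose_wait_settings 8 8    # prints I_MPI_WAIT_MODE=0 I_MPI_THREAD_YIELD=0
```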

I_MPI_PAUSE_COUNT

Control the Intel® MPI Library pause count for the thread yield customization during MPI busy wait time.
Syntax
I_MPI_PAUSE_COUNT=<arg>
Argument
Argument
Description
>=0
Pause count for thread yield customization during MPI busy wait time.
The default value is 0. Normally, the value is less than 100.
Description
This variable is applicable when I_MPI_THREAD_YIELD=1. Small values of I_MPI_PAUSE_COUNT may increase performance, while larger values may reduce energy consumption.

I_MPI_SPIN_COUNT

Control the spin count value.
Syntax
I_MPI_SPIN_COUNT=<scount>
Argument
<scount>
Define the loop spin count when polling fabric(s).
>=0
The default <scount> value is equal to 1 when more than one process runs per processor/core. Otherwise the value equals 2000. The maximum value is equal to 2147483647.
Description
Set the spin count limit. The loop for polling the fabric(s) spins <scount> times before the library releases the processor if no incoming messages are received for processing. Smaller values for <scount> cause the Intel® MPI Library to release the processor more frequently.
Use the I_MPI_SPIN_COUNT environment variable for tuning application performance. The best value for <scount> can be chosen on an experimental basis. It depends on the particular computational environment and application.
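One way to choose <scount> experimentally is to run the application with a few candidate values and compare the elapsed time. The sketch below assumes a placeholder ./a.out binary, four ranks, and three arbitrary candidate values. The wall-clock timing has one-second resolution and is only a rough indicator.

```shell
# Sketch: sweep a few spin counts and report the elapsed time of each run.
tried=""
for scount in 1 250 2000; do
    tried="${tried}${tried:+ }${scount}"
    echo "I_MPI_SPIN_COUNT=${scount}"
    # Launch only when mpirun and the placeholder binary are available.
    if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
        start=$(date +%s)
        mpirun -n 4 -env I_MPI_SPIN_COUNT="${scount}" ./a.out
        end=$(date +%s)
        echo "scount=${scount} elapsed=$((end - start))s"
    fi
done
echo "values tried: ${tried}"
```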

I_MPI_THREAD_SLEEP

Control the Intel® MPI Library thread sleep timeout, in microseconds, for thread yield customization during MPI busy wait time.
Syntax
I_MPI_THREAD_SLEEP=<arg>
Argument
Argument
Description
>=0
Thread sleep timeout in microseconds. The default value is 0. Normally, the value is less than 100.
Description
This variable is applicable when I_MPI_THREAD_YIELD=3. Small values of I_MPI_THREAD_SLEEP may increase performance in the normal mode, while larger values may increase performance in the oversubscription mode.

I_MPI_EXTRA_FILESYSTEM

Control native support for parallel file systems.
Syntax
I_MPI_EXTRA_FILESYSTEM=<arg>
Argument
Argument
Binary Indicator
enable | yes | on | 1
Enable native support for parallel file systems.
disable | no | off | 0
Disable native support for parallel file systems. This is the default value.
Description
Use this environment variable to enable or disable native support for parallel file systems.

I_MPI_EXTRA_FILESYSTEM_FORCE

Syntax
I_MPI_EXTRA_FILESYSTEM_FORCE=<ufs|nfs|gpfs|panfs|lustre|daos>
Description
Force filesystem recognition logic. Setting this variable is equivalent to prefixing all paths in MPI-IO calls with the name of the selected filesystem followed by a colon.
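To make the equivalence concrete, the following sketch prints the prefixed form of a path for a given filesystem name. The helper name prefixed_path and the path /scratch/out.dat are hypothetical, and the helper does not call the library. It only shows what "filesystem plus colon" looks like for the documented values.

```shell
# Sketch: show the path prefix that a forced filesystem setting corresponds to.
prefixed_path() {
    fs=$1      # one of the documented filesystem values
    path=$2    # path as it would be passed to an MPI-IO call
    case "$fs" in
        ufs|nfs|gpfs|panfs|lustre|daos) printf '%s:%s\n' "$fs" "$path" ;;
        *) echo "unsupported filesystem: $fs" >&2; return 1 ;;
    esac
}

export I_MPI_EXTRA_FILESYSTEM_FORCE=lustre
prefixed_path "$I_MPI_EXTRA_FILESYSTEM_FORCE" /scratch/out.dat   # prints lustre:/scratch/out.dat
```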
