Developer Reference

  • 2020 Update 2
  • 07/15/2020

mpitune Configuration Options

Application Options

-app
Sets a template for the command line to be launched to gather tuning results. The command line can contain variables declared as @<var_name>@. The variables are defined further on using other options.
For example:
-app: mpirun -np @np@ -ppn @ppn@ IMB-MPI1 -msglog 0:@logmax@ -npmin @np@ @func@
The application must produce output (in stdout, a file, or any other destination) that the tuner can parse to pick up the value to be tuned and other variables. See the -app-regex and -app-regex-legend options below for details.
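The placeholder substitution described above can be sketched in Python. This only illustrates how @<var_name>@ placeholders might be filled in; it is not the tuner's actual implementation. The helper name expand_template and the sample values are assumptions made for this example.

```python
import re

def expand_template(template, values):
    """Replace every @name@ placeholder with its value from `values`."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value defined for @{name}@")
        return str(values[name])
    # A placeholder is a name wrapped in '@' characters, for example @np@.
    return re.sub(r"@([A-Za-z_][\w-]*)@", substitute, template)

template = "mpirun -np @np@ -ppn @ppn@ IMB-MPI1 -msglog 0:@logmax@ -npmin @np@ @func@"
# Hypothetical values; in mpitune they come from the search space options.
values = {"np": 4, "ppn": 1, "logmax": 5, "func": "BCAST"}
command = expand_template(template, values)
print(command)
```

Raising an error for an undefined placeholder is a design choice of this sketch, so that a missing variable is noticed before anything is launched.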
-app-regex
Sets a regular expression to be evaluated to extract the required values from the application output. Use regular expression groups to assign the values to variables. The associations between variables and groups are set using the -app-regex-legend option.
For example, to extract the #bytes and t_max[usec] values from this output:
#bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
     0         1000         0.06         0.06         0.06
     1         1000         0.10         0.10         0.10
use the following configuration:
-app-regex: (\d+)\s+\d+\s+[\d.+-]+\s+([\d.+-]+)
-app-regex-legend
Specifies a list of variables extracted by the regular expression. Variables correspond to the regular expression groups. The tuner uses the last variable as the performance indicator of the launch. Use the -tree-opt option to set the optimization direction of the indicator.
For example:
-app-regex-legend: size,time
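The following Python sketch shows how the -app-regex and -app-regex-legend examples fit together on the sample benchmark output. It assumes the expression is applied to each output line separately, which may differ from how the tuner actually scans the output; the function name parse_output is hypothetical.

```python
import re

APP_REGEX = r"(\d+)\s+\d+\s+[\d.+-]+\s+([\d.+-]+)"
LEGEND = "size,time".split(",")

output = """\
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.06 0.06 0.06
1 1000 0.10 0.10 0.10
"""

def parse_output(text, pattern, legend):
    """Return one {variable: value} dict per line that matches the pattern."""
    records = []
    for line in text.splitlines():
        match = re.search(pattern, line)
        if match:
            # Group i is assigned to the i-th legend variable.
            records.append(dict(zip(legend, match.groups())))
    return records

records = parse_output(output, APP_REGEX, LEGEND)
print(records)
```

Here the first group captures the #bytes column as size and the second group captures the t_max[usec] column as time, which is the last legend variable and therefore the performance indicator.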
-iter
Sets the number of iterations for each launch with a given set of parameters. A higher number of iterations increases the accuracy of the results.
For example:
-iter: 3

Search Space Options

Use these options to define a search space, which is a set of combinations of Intel® MPI Library parameters that the target application uses for launches. The library parameters are generally configured using run-time options or environment variables.
A search space line can be very long, so line breaking is available for all the search space options. Use a backslash to break a line (see examples below).
-search
Defines the search space by assigning values to the variables declared with the -app option and by setting environment variables for the application launch.
For example:
-search: func=BCAST, \
    np=4,ppn={1,4},{,I_MPI_ADJUST_BCAST=[1,3]},logmax=5
The -app variables are defined as <var1>=<value1>[,<var2>=<value2>][,...]. The following syntax is available for setting values:
  • <value>
    Single value. Can be a number or a string.
    Example: 4
  • {<value1>[,<value2>][,...]}
    List of independent values.
    Example: {2,4}
  • [<start>,<end>[,<step>]]
    Linear range of values with the default step of 1.
    Example: [1,8,2] expands to {1,2,4,6,8}
  • (<start>,<end>[,<step>])
    Exponential range with the default step of 2.
    Example: (1,16) expands to {1,2,4,8,16}
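As a rough illustration of the value syntax, the Python sketch below expands a flat specification into a list of values. The function expand_value is hypothetical and handles only non-nested specifications with integer ranges. It uses plain arithmetic stepping from the start value, which reproduces expansions such as [1,3] and [2,8,2] used elsewhere in this section; it makes no attempt to model how the tuner treats a start value that is not aligned with the step, as in the [1,8,2] example.

```python
def expand_value(spec):
    """Expand one flat value specification into a list of strings."""
    spec = spec.strip()
    if spec.startswith("{") and spec.endswith("}"):
        # List of independent values.
        return [item.strip() for item in spec[1:-1].split(",")]
    if spec.startswith("[") and spec.endswith("]"):
        # Linear range: additive step, default 1.
        parts = [int(p) for p in spec[1:-1].split(",")]
        start, end = parts[0], parts[1]
        step = parts[2] if len(parts) > 2 else 1
        if step < 1:
            raise ValueError("linear step must be positive")
        return [str(v) for v in range(start, end + 1, step)]
    if spec.startswith("(") and spec.endswith(")"):
        # Exponential range: multiplicative step, default 2.
        parts = [int(p) for p in spec[1:-1].split(",")]
        start, end = parts[0], parts[1]
        step = parts[2] if len(parts) > 2 else 2
        if start < 1 or step < 2:
            raise ValueError("exponential range needs start >= 1 and step >= 2")
        values, current = [], start
        while current <= end:
            values.append(str(current))
            current *= step
        return values
    return [spec]  # single value: a number or a string

print(expand_value("(1,16)"))
print(expand_value("[2,8,2]"))
```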
To set environment variables for the command launch, use the following syntax:
  • <variable>=<value>
    Single variable definition. Any of the value syntaxes above can be used: single values, lists, or ranges.
    Examples: I_MPI_ADJUST_BCAST=3 and I_MPI_ADJUST_BCAST=[1,3]
  • {,<variable>=<value>}
    A special case of the syntax above. When set this way, the variable's default value is used first in an application launch.
    Example: {,I_MPI_ADJUST_BCAST=[1,3]}
  • <prefix>{<value1>[,<value2>][,...]}
    Multi-value variable definition. The prefix is the part common to all the values, usually the variable name. A value can be a single value or a combination of values in the format <prefix>(<value1>[,<value2>][,...]). In a combination, the prefix is optional and each value is a string that can use the list and range syntax above.
    Example: I_MPI_ADJUST_ALLREDUCE{=1,=2,(=9,_KN_RADIX=(2,8))}
    See below for a more complete example.
The following example shows a more complex option definition:
I_MPI_ADJUST_BCAST{=1,=2,(=9,_KN_RADIX=(2,8)),(={10,11},_SHM_KN_RADIX=[2,8,2])}
This directive consecutively runs the target application with the following environment variables set:
I_MPI_ADJUST_BCAST=1 I_MPI_ADJUST_BCAST=2 I_MPI_ADJUST_BCAST=9,I_MPI_ADJUST_BCAST_KN_RADIX=2 I_MPI_ADJUST_BCAST=9,I_MPI_ADJUST_BCAST_KN_RADIX=4 I_MPI_ADJUST_BCAST=9,I_MPI_ADJUST_BCAST_KN_RADIX=8 I_MPI_ADJUST_BCAST=10,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=2 I_MPI_ADJUST_BCAST=10,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=4 I_MPI_ADJUST_BCAST=10,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=6 I_MPI_ADJUST_BCAST=10,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=8 I_MPI_ADJUST_BCAST=11,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=2 I_MPI_ADJUST_BCAST=11,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=4 I_MPI_ADJUST_BCAST=11,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=6 I_MPI_ADJUST_BCAST=11,I_MPI_ADJUST_BCAST_SHM_KN_RADIX=8
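The expansion listed above can be reproduced with a short Python sketch. It does not parse the directive text. Instead, the directive is mirrored by hand as a Python structure (an assumption of this example) in which the ranges are already expanded, and each alternative is combined with a Cartesian product.

```python
from itertools import product

PREFIX = "I_MPI_ADJUST_BCAST"

# Hand-written mirror of the directive; each alternative is a list of
# (suffix, values) pairs whose values are combined with each other.
alternatives = [
    [("", [1])],
    [("", [2])],
    [("", [9]), ("_KN_RADIX", [2, 4, 8])],              # (2,8): exponential
    [("", [10, 11]), ("_SHM_KN_RADIX", [2, 4, 6, 8])],  # [2,8,2]: linear
]

def expand(prefix, alternatives):
    """Return one comma-separated environment setting string per launch."""
    launches = []
    for parts in alternatives:
        names = [prefix + suffix for suffix, _ in parts]
        for combo in product(*(values for _, values in parts)):
            launches.append(",".join(f"{n}={v}" for n, v in zip(names, combo)))
    return launches

launches = expand(PREFIX, alternatives)
for launch in launches:
    print(launch)
```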
-search-excl
Excludes certain combinations from the search space. The syntax is identical to that of the -search option. For example:
-search-excl: I_MPI_ADJUST_BCAST={1,2}
or
-search-excl: func=BCAST,np=4,ppn=1,I_MPI_ADJUST_BCAST=1
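To make the interaction between -search and -search-excl concrete, the Python sketch below enumerates the search space from the earlier -search example and removes the combination named in the second -search-excl example. It assumes the search space is the Cartesian product of all listed values and that an exclusion rule matches a combination when every listed variable has the listed value. The sketch handles only single-valued rules, and the data structures are hypothetical.

```python
from itertools import product

# Search space from the -search example, written out by hand.
# None stands for "variable not set", that is, the library default.
search_space = {
    "func": ["BCAST"],
    "np": [4],
    "ppn": [1, 4],
    "I_MPI_ADJUST_BCAST": [None, 1, 2, 3],
    "logmax": [5],
}

# -search-excl: func=BCAST,np=4,ppn=1,I_MPI_ADJUST_BCAST=1
exclusion = {"func": "BCAST", "np": 4, "ppn": 1, "I_MPI_ADJUST_BCAST": 1}

def combinations(space):
    """Yield every combination of values as a {name: value} dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

def is_excluded(combo, rule):
    """A rule matches when all of its name=value pairs appear in combo."""
    return all(combo.get(name) == value for name, value in rule.items())

all_combos = list(combinations(search_space))
remaining = [c for c in all_combos if not is_excluded(c, exclusion)]
print(len(all_combos), len(remaining))
```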
-search-only
Defines a subset of the search space to search in. Only this subset is used for application launches. The syntax is identical to that of the -search option.
This option is useful for the second and subsequent tuning sessions on a subset of parameters from the original session, without creating a separate configuration file.

Output Options

Use these options to customize the output. The tuner can produce output of two types:
  • table: useful for verifying the tuning results; contains values from all the application launches
  • tree: an internal output format; contains the optimal values
-table
Defines the layout for the resulting output table. The option value is a list of variables declared with the -app option, which are joined in colon-separated groups. Each group denotes a specific part of the table.
For example:
-table: func:ppn,np:size:*:time
The variables of the last group (time) are rendered in table cells. The variables of the second-to-last group are used for building table columns (*, which denotes all the variables not present in the other variable groups). The variables of the third-to-last group are used for building table rows (size). All other variable groups make up the table label. Groups containing several variables are complex groups and produce output based on all the value combinations.
For example, the option definition above can produce the following output:
Label: "func=BCAST,ppn=2,np=2"
Legend:
set 0: ""
set 1: "I_MPI_ADJUST_BCAST=1"
set 2: "I_MPI_ADJUST_BCAST=2"
set 3: "I_MPI_ADJUST_BCAST=3"
Table:
           | set 0       | set 1       | set 2       | set 3
-----------|-------------|-------------|-------------|------------
"size=0"   | "time=0.10" | "time=0.08" | "time=0.11" | "time=0.10"
           | "time=0.12" | "time=0.09" | "time=0.12" | "time=0.11"
           |             | "time=0.10" |             |
-----------|-------------|-------------|-------------|------------
"size=4"   | "time=1.12" | "time=1.11" | "time=1.94" | "time=1.72"
           | "time=1.35" | "time=1.18" | "time=1.97" | "time=1.81"
           | "time=1.38" | "time=1.23" | "time=2.11" | "time=1.89"
-----------|-------------|-------------|-------------|------------
"size=8"   | "time=1.21" | "time=1.10" | "time=1.92" | "time=1.72"
           | "time=1.36" | "time=1.16" | "time=2.01" | "time=1.75"
           | "time=1.37" | "time=1.17" | "time=2.24" | "time=1.87"
-----------|-------------|-------------|-------------|------------
...
Cells include only unique values from all the launches for the given parameter combination. The number of launches is set with the -iter option.
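The positional meaning of the groups in a -table value can be summarized with a small Python sketch. The function table_roles and the returned dictionary keys are hypothetical names chosen for this illustration. Requiring at least three groups is a restriction of the sketch, not a documented rule.

```python
def table_roles(layout):
    """Split a -table value into label, row, column and cell groups."""
    groups = [group.split(",") for group in layout.split(":")]
    if len(groups) < 3:
        raise ValueError("this sketch expects row, column and cell groups")
    return {
        "label": groups[:-3],   # every group before the last three
        "rows": groups[-3],     # third-to-last group
        "columns": groups[-2],  # second-to-last group; may be ["*"]
        "cells": groups[-1],    # last group
    }

roles = table_roles("func:ppn,np:size:*:time")
print(roles)
```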
-table-ignore
Specifies the variables to ignore from the -table option definition.
-tree
Defines the layout for the resulting tree of optimal values of the parameter that is tuned (for example, collective operation algorithms). The tree is rendered as a JSON structure. The option value is a list of variables declared with the -app option, which are joined in colon-separated groups. Each group denotes a specific part of the tree. Groups containing several variables are complex groups and produce output based on all the value combinations.
Example:
-tree: func:ppn,np:size:*:time
The first two groups (func and ppn,np) make up the first two levels of the tree. The variables of the last group (time) are used as the optimization criteria and are not rendered. The second-to-last group contains the variables to be optimized (*, which denotes all the variables not present in the other variable groups). The variables of the third-to-last group are used to split the search space into intervals based on the optimal values of parameters from the next group (for example, I_MPI_ADJUST_<operation> algorithm numbers).
For example, the option definition above can produce the following output:
{
  "func=BCAST": {
    "ppn=1,np=4": {
      "size=0": {"I_MPI_ADJUST_BCAST": "3"},
      "size=64": {"I_MPI_ADJUST_BCAST": "1"},
      "size=512": {"I_MPI_ADJUST_BCAST": "2"},
      ...
    }
  }
}
This tree representation is an intermediate format of tuning results and is ultimately converted to a string that the library can understand. The conversion script is specified with the -tree-postprocess option.
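The Python sketch below gives a rough idea of how such a tree could be derived from measurements. It is not the tuner's algorithm. It reuses timings from the sample -table output for func=BCAST,ppn=2,np=2 (leaving out the default set 0), assumes repeated launches are aggregated by their mean, and picks the smallest time, which corresponds to setting -tree-opt to min. The function names are hypothetical, and no interval joining or interpolation is attempted.

```python
import json
from statistics import mean

# size -> {I_MPI_ADJUST_BCAST value -> times from repeated launches}
measurements = {
    0: {"1": [0.08, 0.09, 0.10], "2": [0.11, 0.12], "3": [0.10, 0.11]},
    4: {"1": [1.11, 1.18, 1.23], "2": [1.94, 1.97, 2.11], "3": [1.72, 1.81, 1.89]},
    8: {"1": [1.10, 1.16, 1.17], "2": [1.92, 2.01, 2.24], "3": [1.72, 1.75, 1.87]},
}

def best_value(candidates, direction="min"):
    """Pick the candidate whose mean indicator is best in the given direction."""
    pick = min if direction == "min" else max
    return pick(candidates, key=lambda value: mean(candidates[value]))

def build_tree(label_levels, variable, data, direction="min"):
    """Nest the per-size optimal values under the label levels."""
    tree = {
        f"size={size}": {variable: best_value(cands, direction)}
        for size, cands in sorted(data.items())
    }
    for level in reversed(label_levels):
        tree = {level: tree}
    return tree

tree = build_tree(["func=BCAST", "ppn=2,np=2"], "I_MPI_ADJUST_BCAST", measurements)
print(json.dumps(tree, indent=2))
```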
-tree-ignore
Specifies the variables to ignore from the -tree option definition.
-tree-intervals
Specifies the maximum number of intervals where the optimal parameter value is applied. If not specified, any number of intervals is allowed.
-tree-tolerance
Specifies the tolerance level. A non-zero tolerance (for example, 0.03 for 3%) joins resulting intervals whose performance indicator values vary by no more than the specified tolerance.
-tree-postprocess
Specifies an executable to convert the resulting JSON tree to a custom format.
-tree-opt
Specifies the optimization direction. The available values are max (default) and min.
-tree-file
Specifies a log file where the tuning results are saved.
-tree-view
Specifies the mode used to present the JSON tree. The available values are “simple” and “default”. The “default” mode enables an interpolation mechanism; the “simple” mode disables it. The resulting tree contains message sizes used during the launch.
-mode
Specifies the mpitune mode. The available values are “collect” for gathering data and “analyze” for converting this data to a JSON tree. Note that the -mode field can be defined in the configuration file as the macro @-mode@, although the real value must be defined on the command line.
-dump-file
Specifies the path for the dump file, which mpitune returns after the first iteration. The first iteration can be initialized with “” (an empty string). Note that the -dump-file field can be defined in the configuration file as the macro @-dump-file@, although the real value must be defined on the command line.
