Process Pinning
Use this feature to pin a particular MPI process to a corresponding set of CPUs within a node and avoid undesired process migration. The feature is available on operating systems that provide the necessary kernel interfaces. This page describes the pinning process. You can simulate your pinning configuration using the Pinning Simulator for Intel MPI Library.
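For example, you can launch a job with pinning left at its defaults and inspect the resulting placement; a minimal sketch, where `./myprog` is a placeholder for your application and `I_MPI_DEBUG` at level 4 or higher makes the library print pinning information at startup:

```
# Launch 4 ranks with default pinning; the debug output includes the
# rank-to-CPU pinning map chosen by the library at startup.
$ mpirun -n 4 -genv I_MPI_DEBUG=4 ./myprog
```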
Processor Identification
The following schemes are used to identify logical processors in a system:
- System-defined logical enumeration
- Topological enumeration based on three-level hierarchical identification through triplets (package/socket, core, thread)
The number of a logical CPU is defined as the corresponding position of this CPU bit in the kernel affinity bit mask. Use the cpuinfo utility, provided with your Intel MPI Library installation, or the cat /proc/cpuinfo command to find out the logical CPU numbers.

The three-level hierarchical identification uses triplets that provide information about processor location and their order. The triplets are hierarchically ordered (package, core, and thread).

The following example shows one possible processor numbering for a node with two sockets, four cores (two cores per socket), and eight logical processors (two processors per core).
Note
Logical and topological enumerations are not the same.
| Logical enumeration | 0 | 4 | 1 | 5 | 2 | 6 | 3 | 7 |
|---|---|---|---|---|---|---|---|---|
| Socket | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Core | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Thread | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| Topological enumeration | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Use the cpuinfo utility to identify the correspondence between the logical and topological enumerations. See Processor Information Utility for more details.
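A similar correspondence can also be read directly from sysfs; a minimal sketch, assuming a Linux kernel that exposes the standard topology files under /sys/devices/system/cpu:

```
# For each logical CPU, print its kernel number together with the
# package (socket) and core it belongs to, as reported by the kernel.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "$(basename "$cpu"): package=$(cat "$cpu"/topology/physical_package_id) core=$(cat "$cpu"/topology/core_id)"
done
```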
Default Settings
If you do not specify values for any process pinning environment variables, the default settings below are used. For details about these settings, see Environment Variables and Interoperability with OpenMP API.
- I_MPI_PIN=on
- I_MPI_PIN_MODE=pm
- I_MPI_PIN_RESPECT_CPUSET=on
- I_MPI_PIN_RESPECT_HCA=on
- I_MPI_PIN_CELL=unit
- I_MPI_PIN_DOMAIN=auto:compact
- I_MPI_PIN_ORDER=compact
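Any of these values can also be set explicitly before launching; for instance, a sketch that restates two of the defaults listed above in the shell (exported variables are picked up by the launcher; `./myprog` is a placeholder):

```
# Restate two of the defaults explicitly for all ranks of the job.
$ export I_MPI_PIN=on
$ export I_MPI_PIN_DOMAIN=auto:compact
$ mpirun -n 4 ./myprog
```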
If I_MPI_PIN_ORDER is not specified and one of the sockets (NUMA nodes) is not used, the "bunch" order is automatically used instead of the default "compact" order for better performance.

If hyperthreading is on, the number of processes on the node is greater than the number of cores, and no process pinning environment variables are set, the "spread" order is automatically used instead of the default "compact" order for better performance.
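To keep a fixed ordering regardless of this automatic switching, set I_MPI_PIN_ORDER explicitly; a sketch, with `./myprog` again a placeholder:

```
# Pin one domain per physical core and force the default compact
# ordering, overriding the automatic "bunch"/"spread" substitution.
$ mpirun -n 8 -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_PIN_ORDER=compact ./myprog
```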