Developer Guide and Reference

Thread Affinity Interface (Linux* and Windows*)

The Intel® runtime library can bind OpenMP* threads to physical processing units. The interface is controlled using the KMP_AFFINITY environment variable.
Thread affinity restricts execution of certain threads (virtual execution units) to a subset of the physical processing units in a multiprocessor computer. Depending on the machine topology, the application, and the operating system, thread affinity can have a dramatic effect on the execution speed of a program.
Thread affinity is supported on Windows* systems and on versions of Linux* systems that have kernel support for thread affinity, but it is not supported on macOS*.
The Intel OpenMP runtime library provides three types of interfaces for specifying this binding, collectively referred to as the Intel OpenMP Thread Affinity Interface:
  • The high-level affinity interface uses an environment variable to determine the machine topology and assigns OpenMP* threads to the processors based upon their physical location in the machine. This interface is controlled entirely by the KMP_AFFINITY environment variable.
  • The mid-level affinity interface uses an environment variable to explicitly specify which processors (labeled with integer IDs) are bound to OpenMP* threads. This interface provides compatibility with the gcc* GOMP_AFFINITY environment variable, but you can also invoke it by using the KMP_AFFINITY environment variable. The GOMP_AFFINITY environment variable is supported on Linux* systems only, but users on Windows* or Linux* systems can use the similar functionality provided by the KMP_AFFINITY environment variable.
  • The low-level affinity interface uses APIs to enable OpenMP* threads to make calls into the OpenMP* runtime library to explicitly specify the set of processors on which they are to be run. This interface is similar in nature to sched_setaffinity and related functions on Linux* systems or to SetThreadAffinityMask and related functions on Windows* systems. In addition, you can specify certain options of the KMP_AFFINITY environment variable to affect the behavior of the low-level API interface. For example, you can set the KMP_AFFINITY affinity type to disabled, which disables the low-level affinity interface, or you can use the KMP_AFFINITY or GOMP_AFFINITY environment variables to set the initial affinity mask and then retrieve the mask with the low-level API interface.
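The low-level interface is described above as similar in nature to sched_setaffinity on Linux*. As a rough illustration of that operating-system call only, and not of the Intel low-level affinity API, the following Python sketch reads the caller's affinity mask, narrows it to one allowed processor, and restores it. The helper name pin_and_restore is hypothetical, and the sketch assumes a platform where Python exposes os.sched_setaffinity (typically Linux*); elsewhere it returns None.

```python
import os


def pin_and_restore():
    """Pin the caller to one allowed CPU, then restore the original mask.

    Returns (original_mask, pinned_mask), or None where Python does not
    expose the sched_* affinity calls (for example on Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)   # 0 selects the caller
    target = {min(original)}             # lowest OS proc ID we are allowed to use
    try:
        os.sched_setaffinity(0, target)
        pinned = os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)  # always put the mask back
    return original, pinned


if __name__ == "__main__":
    print(pin_and_restore())
```

The set-then-get pattern mirrors the flow mentioned in the last bullet, where an initial mask is established first and then retrieved programmatically.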
The following terms are used in this section:
  • The total number of processing elements on the machine is referred to as the number of OS thread contexts.
  • Each processing element is referred to as an Operating System processor, or OS proc.
  • Each OS processor has a unique integer identifier associated with it, called an OS proc ID.
  • The term package refers to a single or multi-core processor chip.
  • The term OpenMP* Global Thread ID (GTID) refers to an integer that uniquely identifies each thread known to the Intel OpenMP runtime library. The thread that first initializes the library is given GTID 0. In the normal case, where all other threads are created by the library and there is no nested parallelism, nthreads-var - 1 new threads are created with GTIDs ranging from 1 to nthreads-var - 1, and each thread's GTID is equal to the OpenMP* thread number returned by the function omp_get_thread_num(). The high-level and mid-level interfaces rely heavily on this concept. Hence, their usefulness is limited in programs containing nested parallelism. The low-level interface does not make use of the concept of a GTID, and can be used by programs containing arbitrarily many levels of parallelism.
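The GTID numbering for the normal, non-nested case can be worked through with a small sketch. This is only arithmetic on the description above, not a call into the runtime; the function name flat_team_gtids is hypothetical.

```python
def flat_team_gtids(nthreads_var):
    """Return (initial_gtid, worker_gtids) for one non-nested team.

    Models only the case described in the text: the initializing thread
    has GTID 0 and the library creates nthreads_var - 1 further threads.
    """
    if nthreads_var < 1:
        raise ValueError("nthreads_var must be at least 1")
    initial_gtid = 0
    worker_gtids = list(range(1, nthreads_var))  # GTIDs 1 .. nthreads_var - 1
    return initial_gtid, worker_gtids


if __name__ == "__main__":
    print(flat_team_gtids(4))
```

With nthreads-var set to 4, three new threads are created and the GTIDs in use are 0 through 3, matching the values omp_get_thread_num() would return in that case.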
Some environment variables are available for both Intel® microprocessors and non-Intel microprocessors, but they may perform more optimizations for Intel® microprocessors than for non-Intel microprocessors.

The KMP_AFFINITY Environment Variable

You must set the KMP_AFFINITY environment variable before the first parallel region or before certain API calls, including omp_get_max_threads(), omp_get_num_procs(), and any affinity API calls, as described in Low Level Affinity API, below.
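One simple way to satisfy this ordering requirement is to put the variable into the environment of the process before it starts, so the value is already present when the OpenMP* runtime initializes. The following Python sketch does that from a launcher script. The helper name run_with_affinity is hypothetical, and the child command used here is a stand-in that only echoes the variable; a real use would launch an OpenMP* executable instead.

```python
import os
import subprocess
import sys


def run_with_affinity(command, affinity):
    """Run `command` in a child process whose environment sets KMP_AFFINITY."""
    env = dict(os.environ)            # copy, so the parent environment is untouched
    env["KMP_AFFINITY"] = affinity    # present before the child starts executing
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in child process: prints the KMP_AFFINITY value it inherited.
    child = [sys.executable, "-c", "import os; print(os.environ['KMP_AFFINITY'])"]
    print(run_with_affinity(child, "verbose,none").stdout.strip())
```

Setting the variable in the launching environment avoids having to reason about which call in the program first initializes the runtime.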
The KMP_AFFINITY environment variable uses the following general syntax:
Syntax
KMP_AFFINITY=[<modifier>,...]<type>[,<permute>][,<offset>]
For example, to list a machine topology map, specify KMP_AFFINITY=verbose,none to use a modifier of verbose and a type of none.
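The ordering in the general syntax (modifiers first, then the type, then the optional permute and offset values) can be sketched as a small string builder. The function name build_kmp_affinity is hypothetical and the sketch does not validate keywords. It also assumes, conservatively, that an offset should only be given together with a permute value, because a single trailing integer would occupy the permute position in the syntax.

```python
def build_kmp_affinity(affinity_type, modifiers=(), permute=None, offset=None):
    """Compose a KMP_AFFINITY value as [<modifier>,...]<type>[,<permute>][,<offset>]."""
    if offset is not None and permute is None:
        # Assumption of this sketch: refuse an offset without a permute,
        # since a lone integer would land in the permute position.
        raise ValueError("offset requires permute in this sketch")
    parts = list(modifiers)        # zero or more modifiers come first
    parts.append(affinity_type)    # the type is the only required field
    if permute is not None:
        parts.append(str(permute))
    if offset is not None:
        parts.append(str(offset))
    return ",".join(parts)


if __name__ == "__main__":
    print(build_kmp_affinity("none", modifiers=["verbose"]))
    print(build_kmp_affinity("compact", ["granularity=fine"], permute=1, offset=0))
```

The first call reproduces the verbose,none example from the text; the second shows where the two integer fields go when both are supplied.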
The following table describes the supported arguments.
Argument: modifier
Default: noverbose, respect, granularity=core
Description: Optional. String consisting of a keyword and specifier.
  • granularity=<specifier>, where <specifier> is one of fine, thread, core, or tile
  • norespect
  • noverbose
  • nowarnings
  • proclist={<proc-list>}
  • respect
  • verbose
  • warnings
The syntax for <proc-list> is explained in mid-level affinity interface.
On Windows* with multiple processor groups, the norespect affinity modifier is assumed when the process affinity mask equals a single processor group (which is the default on Windows*). Otherwise, the respect affinity modifier is used.

Argument: type
Default: none
Description: Required string. Indicates the thread affinity to use.
  • balanced
  • compact
  • disabled
  • explicit
  • none
  • scatter
  • logical (deprecated; instead use compact, but omit any permute value)
  • physical (deprecated; instead use scatter, possibly with an offset value)
The logical and physical types are deprecated but supported for backward compatibility.

Argument: permute
Default: 0
Description: Optional. Positive integer value. Not valid with type values of explicit, none, or disabled.