Controlling Thread Allocation
The KMP_HW_SUBSET and KMP_AFFINITY environment variables allow you to control how the OpenMP* runtime uses the hardware threads on the processors. With them, you can try different distributions of threads across the processor cores and choose how those threads are bound to the cores, so that you can find the configuration that works best for your application.
The KMP_HW_SUBSET variable controls the allocation of hardware resources, and the KMP_AFFINITY variable controls how the OpenMP threads are bound to those resources.
Controlling Thread Distribution
The KMP_HW_SUBSET variable controls the hardware resources that the program uses. It specifies the number of sockets to use, how many cores to use per socket, and how many threads to assign per core. While specifying two threads per core often yields better performance than one thread per core, specifying three or four threads per core may or may not improve performance. This variable gives you a convenient way to measure performance with up to four threads per core.
For example, you can determine the effects of assigning 24, 48, 72, or the maximum 96 OpenMP threads in a system with 24 cores by specifying the following variable settings:
To Assign This Number of Threads ...    ... Use This Setting
24                                      KMP_HW_SUBSET=24c,1t
48                                      KMP_HW_SUBSET=24c,2t
72                                      KMP_HW_SUBSET=24c,3t
96                                      KMP_HW_SUBSET=24c,4t
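The thread counts above correspond to one, two, three, or four threads on each of the 24 cores. The following Python sketch generates the matching KMP_HW_SUBSET values and shows one possible way to launch a program under each of them. The helper names and the ./my_app command are hypothetical placeholders and are not part of the OpenMP runtime. Only the KMP_HW_SUBSET variable itself comes from the text above.

```python
import os
import subprocess


def hw_subset_settings(cores, max_threads_per_core):
    """Map total thread count -> KMP_HW_SUBSET value for a fixed core count."""
    return {
        cores * t: f"{cores}c,{t}t"
        for t in range(1, max_threads_per_core + 1)
    }


def run_with_subset(command, setting):
    """Run `command` with KMP_HW_SUBSET set to `setting`; return its exit code."""
    env = dict(os.environ, KMP_HW_SUBSET=setting)
    # Drop OMP_NUM_THREADS so it cannot conflict with the chosen subset.
    env.pop("OMP_NUM_THREADS", None)
    return subprocess.run(command, env=env).returncode


if __name__ == "__main__":
    for threads, setting in hw_subset_settings(24, 4).items():
        print(f"{threads} threads: KMP_HW_SUBSET={setting}")
        # Uncomment to launch a real binary; "./my_app" is a placeholder.
        # run_with_subset(["./my_app"], setting)
```

Timing each run (for example with time.perf_counter around run_with_subset) would then show which threads-per-core value suits a given application.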
Take care when using the OMP_NUM_THREADS variable along with this variable. Setting OMP_NUM_THREADS to a value that does not match the hardware threads selected by KMP_HW_SUBSET can result in oversubscription or undersubscription.
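One way to spot such a mismatch before a run is to compare OMP_NUM_THREADS with the number of hardware threads that the KMP_HW_SUBSET value selects. The Python sketch below does this under stated assumptions: it understands only simple terms of the form Ns, Nc, and Nt, it requires both a core term and a thread term, and it takes the socket count from a caller-supplied parameter when no s term is given. The function names are hypothetical, and the runtime may accept forms that this sketch rejects.

```python
import re

# Matches one simple term such as "2s", "24c", or "4t".
_TERM = re.compile(r"^(\d+)([sct])$")


def subset_thread_count(setting, sockets_available=1):
    """Return the hardware threads selected by a simple KMP_HW_SUBSET value.

    `sockets_available` is an assumption supplied by the caller; it is used
    only when the setting has no socket term.
    """
    counts = {"s": sockets_available}
    for term in setting.split(","):
        match = _TERM.match(term.strip())
        if match is None:
            raise ValueError(f"unsupported KMP_HW_SUBSET term: {term!r}")
        counts[match.group(2)] = int(match.group(1))
    if "c" not in counts or "t" not in counts:
        raise ValueError("this sketch needs both a core and a thread term")
    return counts["s"] * counts["c"] * counts["t"]


def subscription_status(omp_num_threads, hw_threads):
    """Classify a thread request against the selected hardware threads."""
    if omp_num_threads > hw_threads:
        return "oversubscribed"
    if omp_num_threads < hw_threads:
        return "undersubscribed"
    return "matched"


if __name__ == "__main__":
    hw = subset_thread_count("24c,2t")
    print(hw, subscription_status(96, hw))
```

Leaving OMP_NUM_THREADS unset is often the simpler choice when KMP_HW_SUBSET already describes the resources you want to use.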
Controlling Thread Bindings
The KMP_AFFINITY variable controls how the OpenMP threads are bound to the hardware resources allocated by the KMP_HW_SUBSET variable. While this variable can be set to several binding or affinity types, the following are the recommended affinity types to use to run your OpenMP threads on the processor: