Intel thread affinity environment variable for OpenMP*

Problem:

I am using the Intel® compiler and OpenMP* to parallelize some work as follows:

#pragma omp parallel num_threads(4)
{
    DoWork();
}

I would like to experiment with Intel's thread affinity feature for OpenMP, but I am unsure of exactly how to use the environment variable (for example, where do I place the environment statement in my code?).

Could you give a small example showing proper usage and where to place the statement?


Environment:
Intel C++ compiler, Linux*, Mac OS* X, Windows*

Resolution:


The Intel thread affinity environment variable for OpenMP, KMP_AFFINITY, is explained in the Intel® compiler user guide topic “Thread Affinity Interface (Linux* and Windows*)”. Refer to that topic for detailed information.

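There are many ways to set thread affinity using KMP_AFFINITY. Because it is an environment variable, it is set in the shell before the program is launched rather than by a statement in the source code. I am providing an example below, tstcase.cpp, in which I map OpenMP threads 0, 1, 2 and 3 to OS processors 3, 2, 1 and 0 respectively.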
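Before the full example, the thread-to-processor mapping described above can be pictured with a small standalone sketch. It is not code from the Intel runtime: parse_simple_proclist and proc_for_thread are hypothetical helpers written for illustration, they accept only a plain comma-separated list such as proclist=[3,2,1,0] (no ranges or {..} sets), and the wrap-around for extra threads is an assumption about how an explicit list is reused.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the Intel runtime): extracts OS processor
// IDs from a setting that contains "proclist=[a,b,c]". Only plain integers
// separated by commas are supported; any other syntax yields an empty list.
std::vector<int> parse_simple_proclist(const std::string& setting)
{
    std::vector<int> procs;
    const std::string key = "proclist=[";
    const std::size_t start = setting.find(key);
    if (start == std::string::npos) return procs;

    const std::size_t open = start + key.size();
    const std::size_t close = setting.find(']', open);
    if (close == std::string::npos) return procs;

    int value = 0;
    bool have_digits = false;
    for (std::size_t pos = open; pos < close; ++pos) {
        const char ch = setting[pos];
        if (ch >= '0' && ch <= '9') {
            value = value * 10 + (ch - '0');
            have_digits = true;
        } else if (ch == ',') {
            if (have_digits) procs.push_back(value);
            value = 0;
            have_digits = false;
        } else if (ch != ' ') {
            return std::vector<int>();  // ranges, sets, etc. are not handled
        }
    }
    if (have_digits) procs.push_back(value);
    return procs;
}

// Assumed mapping: OpenMP thread i is bound to the i-th list entry, and
// threads beyond the end of the list wrap around to the beginning.
int proc_for_thread(const std::vector<int>& procs, int thread_id)
{
    assert(thread_id >= 0);
    if (procs.empty()) return -1;  // no explicit list available
    return procs[static_cast<std::size_t>(thread_id) % procs.size()];
}
```

With the setting used in this article, "verbose,proclist=[3,2,1,0]", the helpers give processor 3 for thread 0 and processor 0 for thread 3, which is the binding the runtime reports in its verbose output.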

$ cat tstcase.cpp
// tstcase.cpp
// Thread affinity interface
//

#include <omp.h>
#include <iostream>
using namespace std;

#define N 1000

int main ()
{
    cout << "Starting openmp test." << endl;

    int i;
    float a[N], b[N], c[N], d[N];

    /* Some initializations */
    for (i=0; i < N; i++)
    {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }

    #pragma omp parallel shared(a,b,c,d) private(i)
    {

        #pragma omp sections nowait
        {

            // Section #1: reads a and b, writes only c
            //
            #pragma omp section
            {
                cout << "Section #1." << endl;
                for (i=0; i < N; i++)
                    c[i] = a[i] + b[i];
            } // End of section #1

            // Section #2: reads a and b, writes only d
            //
            #pragma omp section
            {
                cout << "Section #2." << endl;
                for (i=0; i < N; i++)
                    d[i] = a[i] * b[i];
            } // End of section #2

        } // End of sections

    } // End of parallel

    cout << "Exiting openmp test." << endl;
    return 0;
}

$ icc -openmp tstcase.cpp
$ export KMP_AFFINITY="verbose,proclist=[3,2,1,0]"
$ ./a.out
OMP: Warning #63: KMP_AFFINITY: proclist specified, setting affinity type to "explicit".
Starting openmp test.
OMP: Info #149: KMP_AFFINITY: Affinity capable, using global cpuid instr info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,2,3}
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #159: KMP_AFFINITY: 2 packages x 2 cores/pkg x 1 threads/core (4 total cores)
OMP: Info #160: KMP_AFFINITY: OS proc to physical thread map ([] => level not in map):
OMP: Info #168: KMP_AFFINITY: OS proc 0 maps to package 0 core 0 [thread 0]
OMP: Info #168: KMP_AFFINITY: OS proc 2 maps to package 0 core 1 [thread 0]
OMP: Info #168: KMP_AFFINITY: OS proc 1 maps to package 3 core 0 [thread 0]
OMP: Info #168: KMP_AFFINITY: OS proc 3 maps to package 3 core 1 [thread 0]
OMP: Info #147: KMP_AFFINITY: Internal thread 0 bound to OS proc set {3}
OMP: Info #147: KMP_AFFINITY: Internal thread 1 bound to OS proc set {2}
OMP: Info #147: KMP_AFFINITY: Internal thread 2 bound to OS proc set {1}
OMP: Info #147: KMP_AFFINITY: Internal thread 3 bound to OS proc set {0}
Section #1.
Section #2.
Exiting openmp test.
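
The verbose output above comes from the runtime itself. As a cross-check from inside the program, a sketch along the following lines records the OS processor each OpenMP thread is observed on. It assumes Linux, where glibc provides sched_getcpu(); on other platforms this sketch simply reports -1. The function names current_os_proc, observe_thread_placement and print_thread_placement are hypothetical, and the _OPENMP guards only let the file build when OpenMP is not enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif
#ifdef __linux__
#include <sched.h>  // sched_getcpu(), a glibc extension
#endif

// Returns the OS processor the calling thread is currently running on,
// or -1 on platforms where this sketch has no way to ask.
int current_os_proc()
{
#ifdef __linux__
    return sched_getcpu();
#else
    return -1;
#endif
}

// Runs one parallel region and records, indexed by OpenMP thread number,
// the OS processor each thread was observed on. Slots stay at -1 for
// threads that did not take part in the region.
std::vector<int> observe_thread_placement()
{
    int max_threads = 1;
#ifdef _OPENMP
    max_threads = omp_get_max_threads();
#endif
    std::vector<int> placement(max_threads, -1);

    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        assert(tid >= 0 && tid < max_threads);
        placement[tid] = current_os_proc();  // each thread writes its own slot
    }
    return placement;
}

// Prints one line per OpenMP thread.
void print_thread_placement(const std::vector<int>& placement)
{
    for (std::size_t tid = 0; tid < placement.size(); ++tid)
        std::printf("OpenMP thread %zu observed on OS proc %d\n",
                    tid, placement[tid]);
}
```

Calling print_thread_placement(observe_thread_placement()) from main, building with OpenMP enabled, and running with the same KMP_AFFINITY setting as above should show each thread on the processor the verbose output says it is bound to. Without a binding, the reported processors may differ from run to run because the OS is free to migrate threads.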




Comments

Thomas Willhalm (Intel):

A detailed description of KMP_AFFINITY can be found here:
http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/2011Update/cpp/lin/optaps/common/optaps_openmp_thread_affinity.htm