Control OpenMP* thread affinity with the ippSetAffinity function

Thread affinity restricts the execution of certain threads to a subset of the processor cores in a multicore computer. It can have a significant effect on the execution speed of a program on a multicore system.

For example, on a system that has several packages, binding Intel IPP function threads to specific processor cores can help reduce the overhead of cache data communication and cache line invalidation. Moreover, Intel IPP functions execute at high efficiency, consuming most of the available processor resources and performing identical operations on each thread, so two such threads placed on the same physical core compete for the same execution resources. On a system with Intel® Hyper-Threading Technology (Intel® HT Technology), it is therefore better to have Intel IPP threads run on separate cores.
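To make the idea of binding a thread to a core concrete, here is a minimal, Linux-only sketch that uses the operating system's sched_getaffinity and sched_setaffinity calls directly. It is not how Intel IPP or the OpenMP* runtime is implemented; the helper names first_allowed_cpu, pin_calling_thread, and allowed_cpu_count are hypothetical and exist only for illustration.

```c
#define _GNU_SOURCE          /* required for the CPU_* macros in <sched.h> */
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    }
    return -1;
}

/* Restrict the calling thread to a single CPU. Returns 0 on success. */
int pin_calling_thread(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t one;
    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    return sched_setaffinity(0, sizeof(one), &one);
}

/* Number of CPUs in the calling thread's affinity mask, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}
```

Calling pin_calling_thread(first_allowed_cpu()) leaves the thread with exactly one CPU in its mask. In an OpenMP* program, the same effect is normally obtained through the runtime's own affinity controls rather than by calling the operating system directly.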

The OpenMP* runtime library provides environment variables and APIs to bind OpenMP threads to physical processing units. However, the settings that maximize the performance of Intel IPP threaded functions differ from one processor to another. For example, on a system with Intel® HT Technology, users need the following environment setting:

KMP_AFFINITY=granularity=fine,compact,1,0

On a system that does not include Intel® HT Technology, users need to set the environment as follows:

KMP_AFFINITY=compact
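As a sketch of how an application might choose between these two settings programmatically, the following C fragment returns the value given above for each case and exports it with the POSIX setenv call. The function names kmp_affinity_for and export_kmp_affinity and the ht_enabled flag are hypothetical; how the application determines whether Intel® HT Technology is enabled is left open. It also assumes the variable is set before the OpenMP* runtime initializes, since a runtime that has already started would not be expected to pick up the change.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the KMP_AFFINITY value listed above for the given case.
   ht_enabled is supplied by the caller (nonzero = Intel HT Technology on). */
const char *kmp_affinity_for(int ht_enabled)
{
    return ht_enabled ? "granularity=fine,compact,1,0" : "compact";
}

/* Export KMP_AFFINITY into the current process environment.
   Returns 0 on success, -1 if the variable could not be set. */
int export_kmp_affinity(int ht_enabled)
{
    const char *value = kmp_affinity_for(ht_enabled);
    assert(value != NULL);   /* both branches return string literals */

    if (setenv("KMP_AFFINITY", value, 1) != 0)
        return -1;

    /* Read the variable back to confirm the environment was updated. */
    const char *seen = getenv("KMP_AFFINITY");
    return (seen != NULL && strcmp(seen, value) == 0) ? 0 : -1;
}
```

Setting the variable in the shell before launching the program achieves the same result and avoids any dependence on initialization order.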

Intel IPP 7.0 introduced the ippSetAffinity function, which uses OpenMP* APIs to set affinity for a number of threads. This function gives users an easy way to control thread affinity:

IppStatus ippSetAffinity(IppAffinityType aType, int offset);

aType is the type of affinity setting. The possible values include ippAffinityCompactFineCore, ippAffinityCompactFineHT, and others.

When aType is ippAffinityCompactFineCore, the function binds adjacent OpenMP* threads to adjacent processor cores. Users who want Intel IPP threaded functions to make full use of the processors are advised to set aType to this value. For the other values of aType, see the description of the ippSetAffinity function in the Intel IPP manual.

The following example shows how to use the ippSetAffinity function. To make full use of the processors for Intel IPP functions, make sure that the number of Intel IPP threads equals the number of cores (half of the total logical threads on a processor with Intel® HT Technology), and set the affinity so that each thread is bound to its own processor core:

int numThreads = 0;
ippGetNumThreads(&numThreads);

/* htEnabled is a placeholder: nonzero when Intel HT Technology is enabled */
if (htEnabled) {
    ippSetNumThreads(numThreads / 2);
} else {
    ippSetNumThreads(numThreads);
}

ippSetAffinity(ippAffinityCompactFineCore, 0);
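The thread-count rule in the example above can be isolated in a small helper, sketched here under stated assumptions. The name ipp_threads_for_cores and its parameters are hypothetical and not part of Intel IPP; the helper assumes two logical threads per core when Intel® HT Technology is enabled, as the text above does, and it never returns less than one.

```c
#include <assert.h>

/* Compute how many Intel IPP threads to request so that there is one
   thread per physical core.
   logical_threads: value reported by ippGetNumThreads (assumed > 0).
   ht_enabled:      nonzero when Intel HT Technology is enabled; the caller
                    is responsible for determining this. */
int ipp_threads_for_cores(int logical_threads, int ht_enabled)
{
    assert(logical_threads > 0);

    /* With Intel HT Technology, assume two logical threads per core. */
    int threads = ht_enabled ? logical_threads / 2 : logical_threads;

    /* Never request fewer than one thread. */
    return threads > 0 ? threads : 1;
}
```

The result would be passed to ippSetNumThreads before calling ippSetAffinity. For instance, 8 logical threads with Intel® HT Technology enabled yields 4, and a single logical thread still yields 1.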

To find more information on threading affinity, please check:
IPP Crypto Sample Performance for OpenSSL too Slow on Hyper-Threading Systems
Training material on ippSetAffinity

For more complete information about compiler optimizations, see our Optimization Notice.