OpenMP nested parallelism


I am trying to understand how to specify thread affinity with nested parallelism, and I am not sure whether I can use KMP_AFFINITY in this case. I have two levels of parallelism. At the first level I have a parallel loop, and I would like each thread of this loop to run on a different processor (I have 10 proc. per core). This corresponds to the scatter type. Inside the parallel loop I call multithreaded OpenMP MKL routines, and for MKL I need compact. This is a beginner question, but how do I get this result? To complicate things a little, I also call MKL outside any parallel region, before the loop, which means I need to change the affinity inside my code.

Thanks for helping,



MKL attempts to detect this scenario and choose an optimal number of threads automatically.

If that does not work, try setting the number of threads explicitly with mkl_set_num_threads().

But if you really want to use nested threading, these affinity settings may help.







OMP_PROC_BIND="spread,close"
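One way to check what these settings actually do is to print where each nested thread lands. The sketch below is only an illustration and is not taken from this thread: `nested_report` is a hypothetical helper, and it assumes an OpenMP 4.5 runtime (for `omp_get_place_num`) together with settings along the lines of `OMP_PLACES=cores`, which is an assumption rather than something stated above. The `#ifdef _OPENMP` guards are there only so it still builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: n_outer outer iterations, each opening an inner team
   that shares n_inner iterations. Returns the number of inner iterations run. */
long nested_report(int n_outer, int n_inner, int verbose)
{
    long total = 0;
    (void)verbose;                      /* unused when built without OpenMP */
#ifdef _OPENMP
    omp_set_max_active_levels(2);       /* allow two active nested levels */
#endif
    /* Level 1: placement follows the first entry of OMP_PROC_BIND (spread). */
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n_outer; ++i) {
        long inner_count = 0;
        /* Level 2: placement follows the second entry (close). */
        #pragma omp parallel reduction(+:inner_count)
        {
#ifdef _OPENMP
            if (verbose) {
                #pragma omp critical
                printf("iter %d: outer thread %d, inner thread %d, place %d\n",
                       i, omp_get_ancestor_thread_num(1),
                       omp_get_thread_num(), omp_get_place_num());
            }
#endif
            #pragma omp for
            for (int j = 0; j < n_inner; ++j)
                inner_count += 1;
        }
        total += inner_count;
    }
    return total;
}
```

Calling `nested_report(4, 8, 1)` with `OMP_PROC_BIND="spread,close"` set should show the outer threads on places that are far apart and each inner team on neighbouring places. A place number of -1 means the thread is not bound to any place.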


Thanks, that will help. Is it possible to modify the settings inside the code? I am using MKL before the parallel loop, so I would need OMP_PROC_BIND=close for MKL and then switch to "spread,close" afterwards. I am assuming I can simply set the environment variables inside the code. Is that correct?
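On setting environment variables from inside the program: the OpenMP specification states that changes made to the environment after the program has started are ignored by the OpenMP implementation, so a `setenv` call would not be expected to switch the policy. A possible alternative is the `proc_bind` clause (OpenMP 4.0), which overrides the policy for a single parallel region. The sketch below assumes the program is started with `OMP_PROC_BIND=close` and with nesting enabled (for example `OMP_MAX_ACTIVE_LEVELS=2`). `threaded_kernel` is a hypothetical stand-in for a threaded MKL call, and `run_phases` is likewise an illustrative name.

```c
#include <stddef.h>

/* Hypothetical stand-in for a threaded MKL routine: sums a vector using
   its own parallel region, the way a threaded library would. */
static double threaded_kernel(const double *x, size_t n)
{
    double s = 0.0;
    /* No proc_bind clause: the region uses the policy currently in effect
       (close, if the program was started with OMP_PROC_BIND=close). */
    #pragma omp parallel for reduction(+:s)
    for (size_t k = 0; k < n; ++k)
        s += x[k];
    return s;
}

/* Phase 1 calls the kernel from serial code; phase 2 calls it from inside
   an outer loop whose threads are spread across the machine. */
double run_phases(const double *x, size_t n, int n_tasks)
{
    /* Phase 1: library call before the loop, bound with the default policy. */
    double result = threaded_kernel(x, n);

    /* Phase 2: proc_bind(spread) applies to this region only; the inner
       regions opened by threaded_kernel keep the default policy. */
    #pragma omp parallel for proc_bind(spread) reduction(+:result)
    for (int t = 0; t < n_tasks; ++t)
        result += threaded_kernel(x, n);

    return result;
}
```

With this layout no setting has to change at run time: only the outer loop asks for spread, and everything else inherits close. Whether MKL's internal regions behave exactly like the stand-in is an assumption worth verifying on the target machine.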


In general, MKL routines perform best with one thread per core on Intel Xeon processors. Just set KMP_AFFINITY=scatter, and if the prospect of MKL generating additional threads inside the parallel region is troubling, temporarily set the MKL thread count to 1 with mkl_set_num_threads().
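The "temporarily change" step could look roughly like the sketch below. `mkl_get_max_threads()` and `mkl_set_num_threads()` are the MKL service routines; everything else is hypothetical: `HAVE_MKL` is an assumed build flag, the stub functions exist only so the sketch compiles without MKL, and `run_with_sequential_mkl` and `noop_task` are illustrative names.

```c
#ifdef HAVE_MKL                 /* hypothetical flag: define when building with MKL */
#include <mkl.h>
#else
/* Stubs that only mimic MKL's thread-count bookkeeping (assumption: 4 threads). */
static int stub_threads = 4;
static int  mkl_get_max_threads(void)   { return stub_threads; }
static void mkl_set_num_threads(int nt) { stub_threads = nt; }
#endif

/* Hypothetical task; a real one would call MKL routines for item t. */
void noop_task(int t) { (void)t; }

/* Runs n_tasks outer iterations with MKL limited to one thread, then
   restores the previous setting. Returns the MKL thread count afterwards. */
int run_with_sequential_mkl(int n_tasks, void (*task)(int))
{
    int saved = mkl_get_max_threads();  /* remember the current setting */
    mkl_set_num_threads(1);             /* MKL runs sequentially inside the loop */

    #pragma omp parallel for
    for (int t = 0; t < n_tasks; ++t)
        task(t);

    mkl_set_num_threads(saved);         /* threaded MKL again outside the loop */
    return mkl_get_max_threads();
}
```

The MKL calls made before and after the loop then use the full thread count, while the outer loop alone decides how the cores are used in between.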




Here are two examples of how OpenMP threads are pinned to different cores on a KNL server with KMP_AFFINITY set to scatter and to compact.


