Hi. I have a problem. I am writing an application for an Intel Xeon Phi (61 cores) that performs a five-point stencil calculation on a 2D matrix. I would like to use OpenMP 4.0 teams, with each team consisting of the 4 threads running on one core, for example Team 1 = threads 1, 2, 3, 4; Team 2 = threads 5, 6, 7, 8; etc. The goal is to reduce cache misses by keeping the calculation for each group of 4 threads around the same L2 cache. I tried to achieve this by setting KMP_AFFINITY="proclist=[1,5,9,...,237,2,3,4,6,7,8,10,11,12,...,238,239,240],explicit". This affinity works for a small number of teams. Is there any way to set the affinity that solves my problem? Thanks.
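For reference, the proclist pattern above can be generated rather than typed by hand. The following is a minimal sketch, not a fix for the teams problem itself. It assumes 60 cores are used (the list in the post ends at 240), 4 hardware threads per core, and a numbering in which core c owns logical CPUs 4c+1 through 4c+4. The helper names `build_proclist` and `kmp_affinity` are hypothetical and exist only for this illustration.

```python
def build_proclist(cores=60, threads_per_core=4, first_cpu=1):
    """Return logical CPU ids in the order used in the post:
    the first hardware thread of every core, then the remaining
    threads core by core (1,5,9,...,237,2,3,4,6,7,8,...,240).

    Assumption: core c owns CPUs first_cpu + c*threads_per_core
    through first_cpu + c*threads_per_core + threads_per_core - 1.
    """
    # First hardware thread of each core.
    masters = [first_cpu + c * threads_per_core for c in range(cores)]
    # Remaining hardware threads, grouped by core.
    workers = [
        first_cpu + c * threads_per_core + t
        for c in range(cores)
        for t in range(1, threads_per_core)
    ]
    return masters + workers


def kmp_affinity(cpus):
    """Format a CPU list as an explicit KMP_AFFINITY value."""
    return "proclist=[" + ",".join(str(cpu) for cpu in cpus) + "],explicit"


if __name__ == "__main__":
    # Print a shell line that can be pasted before launching the application.
    print("export KMP_AFFINITY='" + kmp_affinity(build_proclist()) + "'")
```

Changing `cores` or `first_cpu` adapts the list if the logical CPU numbering on the card differs from the assumption above.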