Performance Issue

Hi, 

I am using Intel Parallel Studio XE 2015 on two different workstations. Workstation1 has a single Intel Xeon X5570 (8 cores) processor and runs RHEL 5. Workstation2 has two Intel Xeon E5-2690 v3 processors (12 cores each) and runs RHEL 7. I run the same C++ code on both workstations with different numbers of threads. On workstation1, the run time drops steeply as I increase the thread count up to 8. On workstation2, the run time also drops as I go up to 12 threads, but the improvement is not as good as on workstation1, and beyond 12 threads the run time increases as more threads are added. I do not understand why workstation2, which has more computing capacity, does not outperform workstation1, or why its performance degrades as the thread count grows. Could you suggest a solution?

 

Thanks in advance

rdg006


It's much more important to pin threads to cores when running on the dual CPU box, e.g. OMP_PROC_BIND=close, due to the much stronger NUMA characteristics.
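One way to see whether pinning is taking effect is to have each thread report the logical CPU it is running on. The following is only a minimal sketch, assuming Linux with glibc (for `sched_getcpu`) and a compiler with OpenMP enabled; the function names `thread_cpus` and `print_thread_cpus` are made up for illustration and are not part of any library.

```cpp
#include <sched.h>   // sched_getcpu (Linux/glibc; an assumption about the platform)
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: records the logical CPU each OpenMP thread is running on.
// Slots for threads that never ran keep the value -1.
std::vector<int> thread_cpus() {
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    std::vector<int> cpus(nthreads, -1);
    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        assert(tid < nthreads);
        cpus[tid] = sched_getcpu();  // each thread writes only its own slot
    }
    return cpus;
}

// Hypothetical helper: prints one line per thread.
void print_thread_cpus() {
    const std::vector<int> cpus = thread_cpus();
    for (std::size_t t = 0; t < cpus.size(); ++t) {
        std::printf("thread %zu on cpu %d\n", t, cpus[t]);
    }
}
```

Calling `print_thread_cpus()` several times in a run, once with no affinity settings and once with something like `OMP_PROC_BIND=close OMP_PLACES=cores OMP_NUM_THREADS=12` in the environment, should show whether threads stay on the same CPUs. If the reported CPUs change between calls, the threads are migrating.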

Thanks Jim P. for the response, but can you explain the difference between the bind settings such as master, close and strict, and also tell me about GOMP_CPU_AFFINITY?

I don't think the moderators will approve of general discussions on OpenMP here. The basic issue is that you must keep the threads pinned and the work spread evenly among the CPUs, because remote NUMA data access incurs significant penalties. GOMP_CPU_AFFINITY should work if you are using libgomp on a target where affinity is implemented, but asking about it suggests that you don't intend to stay on topic for this forum.
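To illustrate the point about remote NUMA access, here is a rough sketch of the common "first touch" pattern. This is an assumption about what may help, not something confirmed for the original poster's code. Under Linux's default policy a memory page is normally placed on the NUMA node of the thread that first writes to it, so initializing an array with the same static schedule that the compute loop uses tends to keep each thread's data local, provided the threads are pinned. The names `make_first_touch` and `sum_static` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocates n doubles without initializing them, then
// writes them in parallel so that each thread is the first to touch its own chunk.
std::unique_ptr<double[]> make_first_touch(std::size_t n, double value) {
    assert(n > 0);
    std::unique_ptr<double[]> a(new double[n]);  // new double[n] leaves the elements unwritten
    const long count = static_cast<long>(n);
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < count; ++i) {
        a[i] = value;
    }
    return a;
}

// Hypothetical compute loop: uses the same static schedule, so with pinned
// threads each one mostly reads the chunk it initialized.
double sum_static(const double* a, std::size_t n) {
    double s = 0.0;
    const long count = static_cast<long>(n);
    #pragma omp parallel for schedule(static) reduction(+:s)
    for (long i = 0; i < count; ++i) {
        s += a[i];
    }
    return s;
}
```

A container such as `std::vector<double>(n)` is avoided here on purpose: it zero-fills the storage from the constructing thread, which would place all pages on that thread's node before the parallel loop runs.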
