Intel® Parallel Amplifier provides Concurrency Analysis, which helps find where processor utilization is poor. The results classify time into four utilization levels, each shown in its own color:
1) "Idle" (gray): no thread is running. Long idle periods indicate poor utilization.
2) "Ok" (orange): serial code is running. Long periods at this level are also undesirable.
3) "Ideal" (green): the number of active threads matches the number of available hardware cores.
4) "Over" (blue): the number of active threads exceeds the number of available hardware cores.
We recommend that the user's code spend most of its time at the "Ideal" level so that the processors are fully utilized.
Here is a simple example named Primes, which counts the prime numbers from 1 to 100,000 using four threads in parallel.
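The source of the Primes sample is not shown here, so the following is only a minimal Python sketch of the same idea: split the range 1 to 100,000 into four chunks and count primes in each chunk on its own thread. The names is_prime, count_range and count_primes_parallel are hypothetical, and the even chunking is an assumption. In CPython the global interpreter lock keeps these threads from running truly in parallel, so the sketch illustrates how the work is divided rather than reproducing the timings discussed in this article.

```python
import threading

def is_prime(n):
    """Trial-division primality test (hypothetical helper)."""
    if n < 2:
        return False
    if n < 4:
        return True          # 2 and 3 are prime
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def count_range(lo, hi, results, idx):
    """Count primes in [lo, hi) and store the count in results[idx]."""
    results[idx] = sum(1 for n in range(lo, hi) if is_prime(n))

def count_primes_parallel(limit=100_000, num_threads=4):
    """Count primes in [1, limit] using num_threads worker threads."""
    results = [0] * num_threads   # one slot per thread, so no lock is needed
    chunk = limit // num_threads  # assumption: equal-sized chunks
    workers = []
    for i in range(num_threads):
        lo = 1 + i * chunk
        # The last thread also takes any remainder up to and including limit.
        hi = limit + 1 if i == num_threads - 1 else 1 + (i + 1) * chunk
        t = threading.Thread(target=count_range, args=(lo, hi, results, i))
        workers.append(t)
        t.start()
    for t in workers:
        t.join()
    return sum(results)

if __name__ == "__main__":
    print(count_primes_parallel())
```

Each thread writes to its own slot of the results list, so the partial counts can be summed after all threads have joined.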
The Concurrency Analysis results above show an Elapsed Time of 1.152s and a CPU Time of 1.932s. Note that CPU Time is accumulated across all logical CPUs and covers both serial and parallel code. CPU Time is NOT the application's running time.
The column expansion button (marked ">>") on the "CPU Time by Utilization" column header splits the concurrency levels into separate columns. The expanded data columns correspond to concurrency levels 1, 2, 3, 4, etc., as shown below.
The Logical CPU Count is 2 in this case, so the Elapsed Time can be understood through the following formula:
Elapsed Time = 0.259s + 0.265s/2 + (0.437s/3)*2 + (0.971s/4)*2 = 0.259s + 0.132s + 0.291s + 0.485s = 1.167s.
1. All time spent in serial code contributes 100% to elapsed time: T1 = 0.259s
2. Time spent in parallel code contributes to elapsed time as (T/threads) * factor, where factor = threads/CPUs rounded up to the next integer
a) Ideal (parallel in two threads): factor = 1, T2 = T/2
b) Parallel in three threads: factor = 2, T3 = (T/3)*2
c) Parallel in four threads: factor = 2, T4 = (T/4)*2
This result approximately matches the Elapsed Time of 1.152s reported in the "Summary" information.
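The rules above can be checked with a short calculation. The following Python sketch is an illustration only and not part of the tool: the function name estimate_elapsed and the dictionary layout are hypothetical, and the inputs are the per-concurrency-level CPU times quoted in this article.

```python
import math

def estimate_elapsed(cpu_time_by_level, logical_cpus):
    """Estimate elapsed time from CPU time per concurrency level.

    cpu_time_by_level maps the number of simultaneously running
    threads to the CPU time (in seconds) accumulated at that level.
    """
    total = 0.0
    for threads, cpu_time in cpu_time_by_level.items():
        if threads == 1:
            # Serial code contributes 100% of its CPU time.
            total += cpu_time
        else:
            # factor = threads/CPUs rounded up to the next integer.
            factor = math.ceil(threads / logical_cpus)
            total += (cpu_time / threads) * factor
    return total

# CPU times per concurrency level, taken from the example in this article.
levels = {1: 0.259, 2: 0.265, 3: 0.437, 4: 0.971}

if __name__ == "__main__":
    print(round(estimate_elapsed(levels, logical_cpus=2), 3))
```

With these inputs the script prints 1.168. The small difference from the 1.167s computed above arises only because the article truncates each term to three decimals before summing.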