Intel® VTune™ Performance Analyzer for Windows* - Setting Sample-After Value

Solution:

One undesirable condition for the Intel® VTune™ Performance Analyzer is a Sample-After value that is too small, which causes sampling interrupts to occur too frequently. The processor then spends much of its time in the analyzer's sampling interrupt handler instead of running the workload. At extremely small Sample-After values, you may run into further problems even if you can keep the analyzer running. The recommended rate (and the target of event calibration) is 1000 samples per second. For example, on a 1 GHz processor, a Clockticks Sample-After value of 2,000 yields 500,000 samples per second, which is far too many: only 2,000 CPU clocks elapse between samples. If the analyzer sampling interrupt consumes 1000 clocks (a guess, but probably not far off), the CPU spends 50% of its time responding to analyzer interrupts instead of OS interrupts, disk I/O, and so on.
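The arithmetic in the example above can be checked with a short Python sketch. The function names are illustrative and not part of any analyzer API, and the 1000-clock handler cost is the same guess used in the text.

```python
def samples_per_second(cpu_hz, sample_after):
    """Clockticks samples per second for a given Sample-After value."""
    return cpu_hz / sample_after


def interrupt_overhead(handler_clocks, sample_after):
    """Fraction of CPU clocks spent in the sampling interrupt handler.

    Valid for the Clockticks event only, where the Sample-After value
    is itself a count of CPU clocks between samples.
    """
    return handler_clocks / sample_after


# Figures from the example: 1 GHz CPU, Sample-After value of 2,000,
# and an assumed (guessed) handler cost of 1000 clocks.
rate = samples_per_second(1_000_000_000, 2_000)
overhead = interrupt_overhead(1_000, 2_000)
print(f"{rate:,.0f} samples/s, {overhead:.0%} of CPU time in the handler")
# prints "500,000 samples/s, 50% of CPU time in the handler"
```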

In fact, the analyzer calculates the default Sample-After value for Clockticks to produce 1000 samples per second. You should not change that value unless there is a compelling reason.

So, the recommendations for the Clockticks event Sample-After value, in order of preference, are:

  1. Don't change the Clockticks Sample-After value that is set by the analyzer. The analyzer calculates the CPU's speed and sets the Sample-After value so that there will be 1000 samples per second.
  2. Use calibration to let the analyzer run a few experiments to determine what Sample-After value is appropriate (although it should get the same answer as recommendation 1 above).
  3. Don't manually set the Sample-After value to a number so small that the sample rate increases by several orders of magnitude. This will cause unpredictable results.
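As a rough illustration of recommendation 1, the Clockticks Sample-After value that gives 1000 samples per second is simply the clock frequency divided by 1000. This is a minimal sketch of that arithmetic only. It is not the analyzer's actual calculation or calibration procedure, and the function name is hypothetical.

```python
TARGET_SAMPLES_PER_SECOND = 1000  # target rate stated in this article


def clockticks_sample_after(cpu_hz, target=TARGET_SAMPLES_PER_SECOND):
    """Sample-After value that yields `target` Clockticks samples per second."""
    return max(1, cpu_hz // target)


# Example clock frequencies in Hz (illustrative values only).
for cpu_hz in (1_000_000_000, 2_400_000_000, 3_000_000_000):
    print(cpu_hz, clockticks_sample_after(cpu_hz))
```

For a 1 GHz processor this gives a Sample-After value of 1,000,000, which is 500 times larger than the value of 2,000 used in the earlier example.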

Note: The optimum Sample-After values for other EBS events differ from the Clockticks value because those events occur at different frequencies (usually less frequently). The goal is to produce about 1000 samples per second for any EBS event.
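The same target can be applied to any event if its rate is known. The sketch below assumes you already have a total event count and the duration of the run from an earlier measurement. The function name and the example numbers are hypothetical, and the analyzer's own calibration may work differently.

```python
def sample_after_from_count(event_count, seconds, target=1000):
    """Estimate a Sample-After value giving about `target` samples per second.

    event_count: total occurrences of the event observed in an earlier run
    seconds:     duration of that run in seconds
    """
    events_per_second = event_count / seconds
    return max(1, round(events_per_second / target))


# Hypothetical measurement: 40,000,000 events observed over 2 seconds,
# that is 20,000,000 events per second.
print(sample_after_from_count(40_000_000, 2.0))  # prints 20000
```

Because such an event occurs far less often than Clockticks, its Sample-After value comes out much smaller than the Clockticks value for the same sample rate.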

The VTune Performance Analyzer is a statistical sampling tool and is not meant to sample after every instruction. If you need more accurate results, you can invest in a hardware analysis tool that plugs on top of the CPU, or you can run a module many times at a lower sampling rate and average the results. The Pause/Resume APIs could also be useful.
