Using Intel® Inspector and Intel® VTune™ Amplifier

Intel® Advisor helps you:

  • Discover where to add parallelism to your program by identifying where your program spends its time. You propose parallel code regions when you annotate the parallel sites and tasks.

  • Predict the performance you might achieve with the proposed parallel code regions.

  • Predict the data sharing problems that could occur in the proposed parallel code regions.

Intel Advisor does not catch all problems, and it cannot ensure that you have correctly implemented the parallelism. Before deploying your parallel program, you need to test it for correctness and verify its performance. To do this, you can use other tools provided in Intel® Parallel Studio XE, Intel® Cluster Studio XE, or a similar Intel suite.

The thread error analysis provided by Intel® Inspector and the Intel Advisor Correctness analysis tool use similar technology. Intel Inspector includes a data race and deadlock detection tool that runs on the parallel code itself. Because it analyzes the actual parallel code rather than the annotated serial code that the Correctness tool analyzes, it can find more errors. Intel Inspector can also find memory problems such as memory leaks, references to freed storage, and references to uninitialized memory. The memory-checking tool works on serial or parallel code.
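To make the kind of defect concrete, the following is a minimal C++ sketch of a data race of the sort a thread error checker is designed to report, next to one possible correction. The function names count_racy and count_atomic are hypothetical and are not part of any Intel API. The sketch assumes a C++11 compiler with std::thread support.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical illustration only; these names do not come from any Intel tool.

// Racy version: several threads update a plain long with no synchronization.
// This is undefined behavior in C++, and the returned total is typically
// smaller than num_threads * increments_per_thread.
long count_racy(int num_threads, int increments_per_thread) {
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                ++counter;  // unsynchronized read-modify-write: data race
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}

// Corrected version: the shared counter is a std::atomic, so each increment
// is an indivisible operation and no updates are lost.
long count_atomic(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();  // join makes all increments visible
    return counter.load();
}
```

Calling count_atomic(4, 10000) returns 40000 on every run, while count_racy with the same arguments may return a different value on each run. An atomic counter is only one way to remove the race; a mutex or a per-thread partial sum combined after the join would also work.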

Similarly, the Intel Advisor Survey and Suitability tools provide features found in Intel® VTune™ Amplifier. The Survey tool profiles your program to find hotspots, and the Suitability tool uses the Intel Advisor annotations to predict approximate parallel performance, including overhead costs. When you have a working parallel program, use Intel VTune Amplifier to measure the performance gain and core utilization of the parallel program, and to check whether the parallel framework overhead is acceptable.

Once you have parallel code, you should:

  • Measure the speedup.

  • Make adjustments if locks are causing excessive delays, or if one task runs much longer than others.
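The two checks above reduce to simple arithmetic once you have timings. The following C++ sketch assumes you have already measured wall-clock times yourself (for example, with a profiler or a timer); the function names speedup, parallel_efficiency, and load_imbalance are hypothetical helpers, not part of any Intel tool.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helpers for interpreting timings you have measured yourself.

// Speedup: how many times faster the parallel run is than the serial run.
double speedup(double serial_seconds, double parallel_seconds) {
    return serial_seconds / parallel_seconds;
}

// Parallel efficiency: speedup divided by the number of cores used.
// A value of 1.0 means perfect scaling; lower values indicate overhead,
// waiting on locks, or idle cores.
double parallel_efficiency(double serial_seconds, double parallel_seconds,
                           int cores) {
    return speedup(serial_seconds, parallel_seconds) / cores;
}

// Load imbalance: longest task time divided by the mean task time.
// A value of 1.0 means all tasks take the same time; a large value means
// one task runs much longer than the others.
// Precondition: task_seconds is not empty.
double load_imbalance(const std::vector<double>& task_seconds) {
    double longest =
        *std::max_element(task_seconds.begin(), task_seconds.end());
    double mean =
        std::accumulate(task_seconds.begin(), task_seconds.end(), 0.0) /
        static_cast<double>(task_seconds.size());
    return longest / mean;
}
```

For example, a serial run of 8.0 seconds and a parallel run of 2.0 seconds on 8 cores give a speedup of 4.0 and an efficiency of 0.5, which suggests that half of the available core time is lost. Task times of 1.0, 1.0, 1.0, and 5.0 seconds give a load imbalance of 2.5, a sign that the longest task may need to be split.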

Intel VTune Amplifier has many features to help you find and fix performance problems in your parallel code. It also helps you answer questions such as:

  • Where are the hotspots now?

  • Am I missing opportunities for more parallelism?

  • Is my program spending a lot of time waiting?

  • How does the performance compare to that of prior versions?

Another technique is to use a debugger to debug a serial version of your parallel program with the parallel constructs in reverse order (see the link under See Also below).
