August 29, 2010 7:00 PM PDT
Intel® Parallel Advisor 2011 (Advisor), along with the other Intel® Parallel Studio 2011 tools, lays out a multi-step process to help developers transition their serial code to efficient and correct parallel code. This article focuses on the first step of the process: how to determine where to add parallelism in an application.
Advisor provides an easy-to-use GUI as a plug-in to Microsoft® Visual Studio. The first step in using Advisor is to run the Survey tool, which helps determine where the application spends most of its time. These “Hotspots” are good starting points when deciding where to add parallelism in an application. Figure 1 shows a screenshot of some profile data generated by Advisor on a K-Nearest Neighbors application.
Figure 1
Figure 1 shows that the majority of the program's run time (59.3%) is spent in a function called std::::vector() and 20.9% in a loop in the method called KNN::distance. Navigating to the vector function reveals that it is part of the Standard Template Library (STL). The best return on investment will most likely come from focusing on the KNN::distance method rather than on the STL. With one mouse click, Advisor navigates to the source code (definition) of this method. Figure 2 shows the breakdown of time spent in KNN::distance.
Figure 2
Looking at this method, it can be seen that a single call should not take very long. It only calculates the distance between two points, and it spends quite a bit of time returning the value in addition to performing the calculation.
If this were all the information that Advisor presented, it might be difficult to continue parallelizing, because the bulk of the work doesn’t appear to be very conducive to parallelism. However, the Survey Report provides much more detailed information.
Figure 3
Figure 3 reveals that the KNN::predict method is making calls to KNN::distance, and the Survey Report also shows that these calls are made in the body of a loop (see the highlighted "loop" line).
This loop identifies another possible site for parallelism that still focuses on the Hotspots of the code. This site appears to be a loop with independent iterations (the execution of one iteration isn’t dependent on a previous iteration), which is a prime candidate for parallelization.
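To make the idea of independent iterations concrete, here is a minimal sketch of what such a loop might look like. It is not the source of the K-Nearest Neighbors application: the names Point, point_distance, and all_distances are hypothetical, and a two-dimensional Euclidean distance is assumed purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point; the real application's data layout is not shown
// in the article.
struct Point {
    double x;
    double y;
};

// Stand-in for a distance method: a small amount of work per call
// (assumed here to be Euclidean distance).
double point_distance(const Point& a, const Point& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return std::sqrt(dx * dx + dy * dy);
}

// Stand-in for the calling loop. Iteration i reads only training[i] and
// query, and writes only result[i], so no iteration depends on another.
std::vector<double> all_distances(const std::vector<Point>& training,
                                  const Point& query) {
    std::vector<double> result(training.size());
    for (std::size_t i = 0; i < training.size(); ++i) {
        result[i] = point_distance(training[i], query);
    }
    assert(result.size() == training.size());
    return result;
}
```

Because each iteration touches its own output slot and only reads shared inputs, the iterations could in principle run in any order or at the same time, which is the property that makes the enclosing loop, rather than the small distance function itself, the more promising site.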
The way Advisor encourages the use of profile information can be very useful in uncovering the not-so-obvious locations where introducing parallelism may greatly improve performance. The Survey Report can present call path information that may reveal better locations to introduce parallelism while still focusing on the Hotspots. The timing breakdowns also reveal interesting characteristics about applications that may not be obvious, such as the heavy overhead of vector operations.
After locating ideal sites for parallelism, the next steps in the Advisor Workflow are to insert Advisor Annotations and gather Suitability and Correctness data about the proposed parallelism. This data helps determine the easiest and most efficient way to parallelize an application.
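As a rough illustration of what annotating a candidate loop might look like, the sketch below marks a loop as a parallel site and each iteration as a task. The macro names follow the annotation style used by Advisor, but they are defined here as no-op stand-ins so the sketch compiles on its own; the exact macro signatures and the header that supplies them (assumed to be advisor-annotate.h) should be checked against the Advisor documentation. The function names and the site and task labels are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// No-op stand-ins so this sketch builds without the Advisor header.
// In a real project these macros would come from Advisor's annotation
// header (assumed to be "advisor-annotate.h") and these definitions
// would be removed.
#ifndef ANNOTATE_SITE_BEGIN
#define ANNOTATE_SITE_BEGIN(site)
#define ANNOTATE_SITE_END(site)
#define ANNOTATE_TASK_BEGIN(task)
#define ANNOTATE_TASK_END(task)
#endif

// Hypothetical per-item work.
double square(double v) { return v * v; }

std::vector<double> square_all(const std::vector<double>& in) {
    std::vector<double> out(in.size());

    ANNOTATE_SITE_BEGIN(square_site);        // proposed parallel region
    for (std::size_t i = 0; i < in.size(); ++i) {
        ANNOTATE_TASK_BEGIN(square_task);    // each iteration is a candidate task
        out[i] = square(in[i]);
        ANNOTATE_TASK_END(square_task);
    }
    ANNOTATE_SITE_END(square_site);

    return out;
}
```

The annotations do not change how the program runs; the code stays serial, and the markers only describe the proposed parallelism so the tools can analyze it.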


