Finding Suitable Sites for Parallelism using Intel® Parallel Advisor Lite

By Jackson M (Intel) (3 posts) on September 8, 2009 at 12:53 pm

As hardware trends away from faster clocks towards more cores per chip, software must adapt to take advantage of multi-core architectures. Performance gains will have to come from parallelizing applications instead of waiting for more cycles per second.

Intel® Parallel Advisor Lite, along with Intel® Parallel Studio, lays out a multi-step process to help developers move their serial code to parallel code. This article focuses on the first step of that process: how to determine where to add parallelism in an application.

Parallel Advisor Lite provides an easy-to-use GUI as a plug-in to Microsoft® Visual Studio that helps developers profile their code and understand where most of the time is being spent. These “Hotspots” are good starting points when deciding where to add parallelism in an application. Figure 1 shows a screenshot of some profile data generated by Parallel Advisor Lite on a K-Nearest Neighbors application.


Figure 1

Figure 1 shows that the majority of the program’s run time is spent in a function called “vector” (the top entry in the screenshot above) and a method called “KNN::distance”. Navigating to the vector function reveals that it is part of the Standard Template Library (STL). In my quest to parallelize my application, I’m going to rule out diving into the STL and focus on the KNN::distance method. With the click of a mouse, Parallel Advisor Lite navigates me to the source code (definition) of this method. Figure 2 shows the breakdown of time spent in KNN::distance.


Figure 2

By looking at this method I can see that a single call shouldn’t take very long. All it does is calculate the distance between two points, and it spends almost as much time returning the value as it does in the calculation. If I had written this method myself, I might have known it was fast without even navigating to the code.
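To make the point concrete, here is a rough sketch of what a method like this might look like. The article does not show the actual source, so the function name euclidean_distance, the use of std::vector<double> for a point, and the Euclidean metric are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for KNN::distance. The real signature and metric
// are not shown in the article; Euclidean distance is assumed here.
double euclidean_distance(const std::vector<double>& a,
                          const std::vector<double>& b) {
    assert(a.size() == b.size());  // both points must have the same dimension
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;  // accumulate squared per-coordinate differences
    }
    return std::sqrt(sum);
}
```

A single call does only a handful of arithmetic operations per coordinate, which is why splitting the work inside one call across threads is unlikely to pay off.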

If this were all the information I had, I might be out of luck for parallelization: even though I’ve found where the bulk of my work is done, that location doesn’t appear to be very conducive to parallelism. Now let’s switch to the Top-Down profile view, which shows where the calls to KNN::distance are coming from and opens up more possibilities for parallelism.


Figure 3

Figure 3 reveals that the KNN::predict method is making the calls to KNN::distance, and by looking at the source code I can see that it makes these calls in the body of a loop.

Now I’ve identified another possible site for parallelism that still focuses on the Hotspots of my code. This site appears to be a loop with independent iterations (the execution of one iteration isn’t dependent on a previous iteration), which is a prime candidate for parallelization.
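A loop of this kind could be parallelized along the lines of the sketch below. This is only an illustration under stated assumptions: the article does not show the body of KNN::predict or name a threading model, so the helper all_distances, the Point alias, the Euclidean metric, and the choice of OpenMP are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;  // assumed point representation

// Hypothetical stand-in for KNN::distance (Euclidean metric assumed).
double euclidean_distance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Hypothetical stand-in for the loop inside KNN::predict: compute the
// distance from the query point to every training point.
std::vector<double> all_distances(const std::vector<Point>& training,
                                  const Point& query) {
    std::vector<double> result(training.size());
    const long n = static_cast<long>(training.size());
    // Each iteration only reads shared data and writes its own result[i],
    // so no iteration depends on another. If OpenMP is not enabled, the
    // pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        result[i] = euclidean_distance(training[i], query);
    }
    return result;
}
```

With OpenMP enabled (for example, -fopenmp with GCC or /openmp with Microsoft Visual C++), the iterations are divided among threads. The parallelism sits at the call site in the loop, not inside the cheap distance calculation, which matches what the Top-Down view suggested.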

The way Parallel Advisor Lite encourages us to use profile information can be very useful in uncovering the not-so-obvious locations where introducing parallelism may greatly improve performance.

You can download a free copy of Parallel Advisor Lite and an evaluation copy of Intel® Parallel Studio to try them out. I’ve used the tool and found that the Top-down and Bottom-up views are both useful in finding sites to parallelize. The Bottom-up view helps to identify the functions that are consuming the most time. The Top-down view can present call path information that may reveal better locations to introduce parallelism while still focusing on the hotspots. The timing breakdowns also reveal interesting characteristics about my applications that I wouldn’t have guessed, such as the heavy overhead of vector operations.

Do you think there are better ways to visualize the data?

Is there anything missing that you wish were presented?

Have you used the tool and been surprised by the results?

Please share your experiences and perspectives!

Categories: Parallel Programming, What If Software

Comments (2)

September 9, 2009 8:43 AM PDT


Sundar Srinivasan
This looks awesome. Any plan to make it available for other operating systems? I have Linux and OpenSolaris and don't have big bucks to buy Visual Studio .NET. I can develop parallel code using Cilk, but debugging and visualizing how much of the code can be run in parallel, etc., are a real pain in the backside.
September 9, 2009 1:28 PM PDT

Jackson M (Intel)
The Intel® Parallel Advisor Lite technology preview is targeted only at Microsoft Windows for the foreseeable future. For Linux flavors, Intel® VTune Performance Analyzer and Intel® Thread Profiler may serve your needs: http://software.intel.com/en-us/intel-vtune/ .
Let me know if you have any additional questions.
