Intel® Parallel Studio: Great for Serial Code Too (Episode 1)
By James Cownie (Intel) (2 posts) on December 7, 2009 at 9:50 am
Although the name might not suggest it, if you develop C or C++ code in Microsoft Visual Studio, Intel Parallel Studio may be just what you’re looking for even if you have no intention of parallelizing your code.
If you need any of these features:
- Detailed memory access checking and leak detection
- Fully integrated performance analysis
- Optimizing C/C++ compiler
- Highly optimized functions for audio and video processing, signal processing or compression
then Parallel Studio is for you.
Let’s take those one by one, using a simple program as an example. The program tries to solve the following logic problem:
Find a number consisting of 9 digits in which each of the digits from 1 to 9 appears only once. This number must also satisfy these divisibility requirements:
- The number should be divisible by 9.
- If the rightmost digit is removed, the remaining number should be divisible by 8.
- If the rightmost digit of the new number is removed, the remaining number should be divisible by 7.
- And so on, until there’s only one digit (which will necessarily be divisible by 1).
I actually solve the more general problem, in which I have an N-digit number in base N+1. The sample code can be found here - divisible.cpp.
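To make the generalized problem concrete, here is a minimal sketch of one way to search for such numbers. It is not the code from divisible.cpp; it assumes a plain depth-first search over prefixes, and the names `solve` and `extend` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Depth-first search over prefixes: extend a prefix of 'length' digits by one
// unused digit, and keep it only if the new prefix is divisible by its length.
// 'base' is N+1 in the article's terms, so the digits run from 1 to base-1.
static void extend(int base, int length, std::uint64_t prefix,
                   std::vector<bool>& used, std::vector<std::uint64_t>& found)
{
    if (length == base - 1) {                    // all N digits have been placed
        found.push_back(prefix);
        return;
    }
    for (int d = 1; d < base; ++d) {
        if (used[d]) continue;                   // each digit appears only once
        std::uint64_t next = prefix * base + d;
        if (next % (length + 1) != 0) continue;  // prune: prefix fails its test
        used[d] = true;
        extend(base, length + 1, next, used, found);
        used[d] = false;
    }
}

// Returns the value of every N-digit solution for the given base (N = base-1).
std::vector<std::uint64_t> solve(int base)
{
    assert(base >= 2 && base <= 16);             // keeps every prefix within 64 bits
    std::vector<bool> used(base, false);
    std::vector<std::uint64_t> found;
    extend(base, 0, 0, used, found);
    return found;
}
```

With this sketch, `solve(10)` returns a single value, 381654729, which is the answer to the base-10 puzzle stated above.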
As written, divisible.cpp seems to work; after all, it produces the right answer. However, maybe there are still problems. I’ll use Parallel Inspector to see.
Memory Debugging
Once you have installed Parallel Studio, you will find additional toolbars in your Visual Studio.
For correctness issues like memory debugging, use the Parallel Inspector tool, whose toolbars are tagged with the “In” symbol (“In” here stands for “Inspector”, not the element indium). When I click on the “Inspect” button, a dialog window opens to ask how strict I want to be. Since I’m trying to find all the problems I can, I turn the analysis up to be as intensive as it can be by moving the slider to the bottom. This is the most time-consuming test mode, but since the code is relatively small, that’s fine.
Hitting the “Run Analysis” button executes the code and produces the results.
Ouch! There are a lot of problems, despite the code seeming to be correct.
A quick look here shows that we have two invalid memory accesses at line 49 in divisible.cpp, and many memory leaks, mostly at line 46.
Obviously I need some more help, so I hit the “Interpret Results” button.
Let’s fix the memory access problems first, since those seem more dangerous than the leaks. Double-clicking on the highlighted problem line shows the source code and, below that, more detail about the problem.
Applying my brain, I can see that the loop’s upper-bound test here is wrong. I have i<=nDigits, but looking at the constructor I can see that I allocated the digits array with only nDigits elements, so I’m accessing one element beyond the end of the array. The loop condition should be i<nDigits. So, the copy constructor should look like this (note the change from <= to <). You can get straight to a source editing window by double-clicking any line in the Source View.
// Copy constructor
number (const number & other)
{
    nDigits = other.nDigits;
    digits = new int [nDigits];
    for (int i=0; i<nDigits; i++)    // was i<=nDigits
        digits[i] = other.digits[i];
}
That bug nicely explains both of the invalid memory accesses, since the same line is both reading and writing the digits array.
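As an aside, the same off-by-one can be made visible without a tool by using a bounds-checked container. The sketch below is only an illustration, not part of divisible.cpp: the function name `copyOverruns` is hypothetical, and it assumes `std::vector` stands in for the raw `digits` array. `std::vector::at` throws `std::out_of_range` on a bad index instead of silently touching memory past the end.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Copies src element by element, using either the buggy inclusive bound
// (i <= size) or the correct exclusive bound (i < size). Returns true if
// the copy tried to access an element outside the arrays.
bool copyOverruns(const std::vector<int>& src, bool inclusiveBound)
{
    std::vector<int> dst(src.size());
    // An inclusive bound performs one iteration too many.
    const std::size_t limit = inclusiveBound ? src.size() + 1 : src.size();
    try {
        for (std::size_t i = 0; i < limit; ++i)
            dst.at(i) = src.at(i);   // at() checks the index on both sides
    } catch (const std::out_of_range&) {
        return true;                 // the extra iteration ran off the end
    }
    return false;
}
```

With the inclusive bound, the final iteration both reads and writes one element past the end, which matches the two invalid accesses reported on the single source line.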
Now that I’ve fixed the invalid accesses, what about the memory leaks?
I investigate them by clicking on the tab for the experiment (in this case r005mi4) to switch back to the Inspector view. Now I can select one of the leaks and then switch from the Overview to the Sources view by selecting the Sources box. That shows me a display like this:
I can now see that the source of the leak is the point in the constructor for number where I allocate space for the digits. If I look at the other leaks (which are shown as separate errors because they occur with different call stacks), I see that all except one are caused by the same allocation, and the one which is different is in the other constructor, where it is making the same allocation. Hmm, time to read the code again…
Aha, I have committed a cardinal sin. I have a constructor which allocates space, but no destructor for the object! That’s easy enough to fix: I just have to add the obvious destructor.
// Destructor
~number()
{
    delete [] digits;
}
Once I’ve done that I can re-run the memory inspector to confirm that I have fixed all the problems which it can find.
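One related point is worth checking once a destructor exists: if the class is ever assigned (rather than copy-constructed), the compiler-generated assignment operator copies the pointer, and two objects then delete the same array. The post does not show whether divisible.cpp defines or uses `operator=`, so the following is a sketch under that assumption. The class is a hypothetical stand-in that reproduces only the members visible in the fragments above; `size` and `operator[]` are added purely so the example can be exercised.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical stand-in for the article's 'number' class.
class number
{
    int  nDigits;
    int *digits;

public:
    // Allocates n zero-initialized digits (n is assumed to be non-negative).
    explicit number (int n) : nDigits(n), digits(new int [n]())
    {
    }

    // Copy constructor, with the corrected loop bound expressed via std::copy.
    number (const number & other)
        : nDigits(other.nDigits), digits(new int [other.nDigits])
    {
        std::copy(other.digits, other.digits + other.nDigits, digits);
    }

    // Copy assignment (assumed addition): the parameter is taken by value,
    // so it is already a private copy; swapping hands our old array to it,
    // and its destructor frees that array when the function returns.
    number & operator= (number other)
    {
        std::swap(nDigits, other.nDigits);
        std::swap(digits, other.digits);
        return *this;
    }

    // Destructor
    ~number()
    {
        delete [] digits;
    }

    int size() const { return nDigits; }

    int & operator[] (int i)
    {
        assert(i >= 0 && i < nDigits);   // catches the off-by-one in debug builds
        return digits[i];
    }
};
```

Holding the digits in a `std::vector<int>` instead of a raw pointer would make all three special members unnecessary, at the cost of changing the class's data layout.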
This post is getting quite long, so I’ll defer looking at performance until next week, when I’ll also think a bit more about the complexity of the problem, and whether I really need all the code I have.
If you’re feeling keen, you can play with Parallel Amplifier before then and see if you can find the same performance problems with my code that I do (unfortunately, there are no prizes!).
Resources
If you want to follow along, or try Intel Parallel Studio on your own code, you can download a free evaluation copy from http://software.intel.com/en-us/intel-parallel-studio-home/
The (buggy) source code for the example is here - divisible.cpp.
Categories: Parallel Programming