Earlier in the month I fleshed out a spatially arranged subdivision method I learned from Matteo Frigo, but I didn’t have time to actually run it and compare it against my baselines. In the meantime, my test machine has been regrooved into a Windows 7 box, so my first order of business is to retest my baselines. I reran the serial algorithm a few times and averaged the results. While the raw numbers (using the same compiler and machine but j
Hi All, first time blogging on ISN. I want to get right into this, so I'll save formal introductions for another time, but there should be a short bio on my profile if you're interested.
Intel® Parallel Amplifier is an excellent tool for identifying hotspots and measuring CPU utilization. Using Amplifier’s Concurrency analysis, it’s very easy to find the places in an application that utilize the CPU poorly, but root-causing these issues is often more complex. To do this, you need to understand the runtime behavior of the application: how many threads are actually running, how the work is distributed between the threads, and where thread execution is “serialized”.
Intel® Parallel Studio Service Pack 1 is now available, adding support for Windows* 7.
SP1 is well worth downloading and installing - here are some of the reasons:
- Parallel Inspector and Parallel Amplifier can now be driven from the command line (for automating test suites).
- Bug fixes, of course. Not many issues needed fixing, but you may appreciate the ones that were found and fixed!
Last time I discovered which function consumes most of the time in the serial algorithm, but there’s still more to discover by narrowing the focus to a specific function of interest. Our function, shown last time and below, is addAcc.
In my last venture I got the n-bodies program to compile and ran a test series with the serial algorithm, showing the n-squared nature of the basic problem. I mean to write a parallel version of this (heh, heh, heh) but first I need to know what is taking up the time.
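To make the "n-squared nature" concrete, here is a minimal sketch of a direct-summation acceleration pass. It is not the code from the post: the `Body` struct, the `computeAccelerations` name, the gravitational constant default and the `softening` term are all assumptions made for illustration. The point is only that every body visits every other body, so the interaction count grows as n * (n - 1).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical body record; the original program's data layout is not shown.
struct Body {
    double x, y, z;     // position
    double ax, ay, az;  // accumulated acceleration
    double mass;
};

// Direct-summation pass: each body accumulates the pull of every other body.
// Returns the number of pair interactions evaluated, which is n * (n - 1).
// G and softening are illustrative defaults, not values from the post.
std::size_t computeAccelerations(std::vector<Body>& bodies,
                                 double G = 1.0,
                                 double softening = 1e-9) {
    assert(G > 0.0 && softening >= 0.0);
    std::size_t interactions = 0;

    // Clear the accumulators left over from any previous step.
    for (Body& b : bodies) {
        b.ax = b.ay = b.az = 0.0;
    }

    for (std::size_t i = 0; i < bodies.size(); ++i) {
        for (std::size_t j = 0; j < bodies.size(); ++j) {
            if (i == j) continue;  // a body does not attract itself

            const double dx = bodies[j].x - bodies[i].x;
            const double dy = bodies[j].y - bodies[i].y;
            const double dz = bodies[j].z - bodies[i].z;

            // Softened squared distance avoids division by zero
            // when two bodies coincide.
            const double r2 = dx * dx + dy * dy + dz * dz + softening;
            const double scale = G * bodies[j].mass / (r2 * std::sqrt(r2));

            bodies[i].ax += scale * dx;
            bodies[i].ay += scale * dy;
            bodies[i].az += scale * dz;
            ++interactions;
        }
    }
    return interactions;
}
```

In this sketch, doubling the body count from 10 to 20 raises the interaction count from 90 to 380, roughly a factor of four, which is the scaling a serial test series would be expected to show.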