Multi-core processors are now ubiquitous, and it has become necessary to exploit the additional processing capability their cores afford. Software written with serial execution on a single-core processor in mind now needs to be retargeted to multi-core processors. To be clear, we are talking about programs that execute on a single machine and in a single process, rather than those that are multi-process or that execute on a distributed cluster.
Parallelizing your programs is fun and exciting. After all, wouldn't you want to boast about improving the performance of your program by 2x or 4x when running on a dual-core or quad-core machine? I definitely would! :)
One way to do this is to identify a good chunk of work in your program that can be executed safely in parallel on several cores and, if possible, to model the possibilities before deciding on and implementing the threading that divides up this parallel work.
Some things to consider here are:
- How to identify a "good chunk of work which can be done in parallel"?
- How to safely execute the parallel portions of your program? In other words, how to avoid introducing instability when you do the work in parallel (remember, instabilities are a lot harder to debug and triage in parallel programs)?
- How to model the various possibilities for introducing threading without actually doing the actual work?
Let us explore the solutions to each of these questions.
1. "Good chunk of work" can often be found relatively easily by running your program under a performance analyzer which can report areas of your code where the most time is spent. Once you identify such "hotspots", with a bit of effort and code inspection, you can find an appropriate insertion point for incorporating threading.
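As a toy illustration of this step (the function names here are made up for the example), you can approximate what a profiler reports by timing a candidate function yourself; a real performance analyzer does this per function and per loop across your whole program:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Hypothetical hotspot: summing a large vector. The tight loop below is
// where the time goes, and the loop body is a candidate insertion point
// for threading.
double heavy_sum(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;   // the "hotspot": most cycles land here
    return s;
}

// Crude stand-in for a profiler: time one call and report it.
double timed_sum(const std::vector<double>& v) {
    auto t0 = std::chrono::steady_clock::now();
    double s = heavy_sum(v);
    auto t1 = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
    std::printf("heavy_sum took %lld us\n", static_cast<long long>(us));
    return s;
}
```

A real profiler saves you from sprinkling timers by hand, but the idea is the same: measure first, then thread only the code that actually dominates the run time.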
2. What is relatively harder is determining if the "good chunk of work" can be executed safely in parallel. It is rather easy to introduce serious instability in your program if the parallel chunks of work interact with each other or trample each other's data. It is much harder to debug and triage instabilities in parallel programs than it is in serial programs. This is perhaps the most important piece of information you need before embarking on threading a piece of code.
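To make the "trampling each other's data" concern concrete, here is a minimal sketch (names invented for the example) of the classic shared-counter case. Two threads updating a plain `long` would race and produce an unpredictable total; making the counter a `std::atomic` is one way to restore safety:

```cpp
#include <atomic>
#include <thread>

// Two workers incrementing one shared counter. With a plain `long` the
// unsynchronized increments would be a data race (lost updates, undefined
// behavior); std::atomic makes the concurrent updates well-defined.
long parallel_count(int per_thread) {
    std::atomic<long> counter{0};
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i)
            counter.fetch_add(1, std::memory_order_relaxed);
    };
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    return counter.load();
}
```

The insidious part is that the racy version often appears to work in light testing, which is exactly why spotting these interactions *before* you commit to threading is so valuable.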
3. It is also desirable to explore the possibilities for introducing threading without actually writing the code to do so. This way, you don't yet have to decide which threading paradigm to use, you don't yet have to learn it, and you don't yet have to incur the development cost of implementing it and stabilizing it in your code. Not yet, at least. What if you incurred all that effort and cost, and then decided it wasn't worth it? It would be a tragic waste! During the exploration phase, you simply want to learn about the opportunities for threading and their associated costs (the number of potential instabilities, for instance) so that you can make good decisions early on. Threaded code development can be non-trivial, and it would be nice if one could simply model pieces of a program for parallelism. This approach can improve productivity and may contribute to your bottom line.
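One way to picture this modeling step (this is a generic sketch with invented macro names, not any particular tool's API) is to mark up a serial loop with no-op annotations saying "this region could be a parallel site; each iteration could be a task." The program still runs serially, but a modeling tool can use such marks to predict speedup and flag conflicts:

```cpp
// Hypothetical no-op annotation macros: they record *where* parallelism
// would go without changing the program's serial behavior. Modeling
// tools provide real annotations in this spirit.
#define SITE_BEGIN(name)  /* region whose tasks could run in parallel */
#define TASK_BEGIN(name)  /* one unit of work inside the site */
#define TASK_END()
#define SITE_END()

// Serial loop, annotated but unchanged in behavior.
double scale(const double* in, double* out, int n, double k) {
    double checksum = 0.0;
    SITE_BEGIN("scale_loop");
    for (int i = 0; i < n; ++i) {
        TASK_BEGIN("scale_elem");
        out[i] = in[i] * k;   // independent per-iteration work: a good task
        checksum += out[i];   // shared update: a conflict the model would flag
        TASK_END();
    }
    SITE_END();
    return checksum;
}
```

The `checksum` update is exactly the kind of hazard you want surfaced during exploration, while the code is still serial and still correct, rather than after you have written and shipped the threaded version.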
While the task of parallelizing a program using software threads is fun, it does require making some good choices. This is where Intel® Parallel Advisor can help. Parallel Advisor addresses all three of the problems detailed above. In future blogs, I will explore each of those areas in greater detail. If you are eager, please explore more details about the product here: Intel® Parallel Advisor. You can even download and evaluate the full functionality of the product for 30 days!
What is your approach to parallelizing your program? In what way does it differ from the approach detailed in this blog? Please share!
Please don't hesitate to drop me a note if you have any questions! Your feedback, questions and comments are most welcome and much appreciated.