Introduction to Parallel Programming video lecture series – Part 11 “Improving Parallel Performance”

This lecture is the eleventh (and penultimate) part of the “Introduction to Parallel Programming” video series. It begins by explaining why a less-than-optimal serial algorithm can be easier to parallelize than the best serial one. It then defines temporal and data locality and explains why maximizing both in a parallel program pays off in performance. The latter part of the lecture demonstrates how loop fusion, loop fission, and loop inversion can create or improve opportunities for parallel execution. Code and pictorial examples illustrate the main topics.
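As a rough illustration of two of these transformations, here is a minimal sketch in C. It is not taken from the lecture itself; the function names, array sizes, and arithmetic are hypothetical and were chosen only to keep the example small. The first pair of functions shows loop inversion (swapping the order of nested loops so that the outer loop has independent iterations), and the second pair shows loop fusion (merging two loops over the same range so each iteration does more work).

```c
#define ROWS 4
#define COLS 3

/* Original nest: row i is computed from row i-1, so the outer i loop
   must run in order. Only the inner j loop has independent iterations,
   and parallelizing it would mean starting a parallel region once per row. */
void double_down_original(double a[ROWS][COLS])
{
    for (int i = 1; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] = 2.0 * a[i - 1][j];
}

/* Inverted nest: each column depends only on itself, so the iterations of
   the outer j loop are independent. A directive such as
   "#pragma omp parallel for" could be placed on the j loop (an assumption;
   this sketch runs serially). Note that the column-wise walk touches memory
   with a larger stride, which trades away some data locality. */
void double_down_inverted(double a[ROWS][COLS])
{
    for (int j = 0; j < COLS; j++)
        for (int i = 1; i < ROWS; i++)
            a[i][j] = 2.0 * a[i - 1][j];
}

/* Two separate loops over the same range: each is parallelizable,
   but each has a small amount of work per iteration. */
void sum_then_scale_separate(const double *a, const double *b,
                             double *c, double *d, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
    for (int i = 0; i < n; i++)
        d[i] = 2.0 * c[i];
}

/* Fused loop: same results, one pass, more work per iteration, and c[i]
   is reused immediately after it is written (better temporal locality). */
void sum_then_scale_fused(const double *a, const double *b,
                          double *c, double *d, int n)
{
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
        d[i] = 2.0 * c[i];
    }
}
```

In both pairs the transformed version computes the same values as the original; only the iteration order or loop structure changes. Whether a given transformation actually improves performance depends on the array sizes, the memory layout, and the number of threads, so it is worth measuring rather than assuming.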

Running time: 15:12

Note: The material presented in this lecture series has been taken from the Intel Software College multi-day seminar, “Introduction to Parallel Programming”, authored by Michael J. Quinn (Seattle University). The content has been reorganized and updated for the lectures in this series.

Downloads are available under the Creative Commons license.

