Classroom challenge: Matrix Multiplication, Performance and Scalability in OpenMP

A simple, widely known and studied problem was posed to the class students: matrix multiplication. We made an internal contest, which was to obtain the fastest serial code in which the students learned a lot about compiler optimizations, and even more, the effect of caches in code performance. The objective of the contest was to extrapoloate this exercise into a massive multicore architecture. Students were given kickstart code with a naive C using an OpenMP implemention of the problem, and a series of rules. The kickstart code and the better student results are included in this posting.

There are downloads available under the Creative Commons License license. Download Now
For more complete information about compiler optimizations, see our Optimization Notice.