A simple, widely known and studied problem was posed to the class students: matrix multiplication. We made an internal contest, which was to obtain the fastest serial code in which the students learned a lot about compiler optimizations, and even more, the effect of caches in code performance. The objective of the contest was to extrapoloate this exercise into a massive multicore architecture. Students were given kickstart code with a naive C using an OpenMP implemention of the problem, and a series of rules.
A challenge to the class: first, write the parallel implementation of the matrix-vector multiplication algorithm where a sparse matrix stored in the CRS format is multiplied by a dense vector. Use OpenMP and run it on multicore processors. Second, write hte parallel implementation of the Dot product of two dense vecors on multicore computers.
The solution set is provided with this posting.
Задача: дана периодическая матрица произвольного размера, необходимо найти подматрицу, сумма элементов которой максимальна.
В дальнейших рассуждениях используется матрица (конкретные числа заменены номерами):
С чего начнем?