Vector programming. SSE4.2 to AVX2 conversion examples.

In this post I’ll try to show how to convert SSE4.2 assembly to AVX2 (using the schemes from the post “Programming using AVX2”) and how the conversion affects performance.

  • Easy case: it is enough to add the “v” prefix and replace “xmm” with “ymm”.

Suppose we have the following loop:

Parallel Sparse Matrix-Vector Multiplication on multi-core computers

A challenge to the class: first, write a parallel implementation of the matrix-vector multiplication algorithm in which a sparse matrix stored in the CRS format is multiplied by a dense vector. Use OpenMP and run it on multicore processors. Second, write a parallel implementation of the dot product of two dense vectors on multicore computers.

The solution set is provided with this posting.
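
For orientation, here is a minimal sketch of the two kernels named in the challenge. It is not the solution set from the posting. The function names (spmv_crs, dot) and the array names (row_ptr, col_idx, val) are assumptions chosen for illustration, and the CRS layout is assumed to be the usual one with zero-based indices and n_rows + 1 row pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: y = A * x, with A stored in CRS (compressed row storage).
   Assumed layout: row_ptr has n_rows + 1 entries, and the nonzeros of row i are
   val[row_ptr[i]] .. val[row_ptr[i + 1] - 1], with their columns in col_idx. */
void spmv_crs(int n_rows, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    assert(n_rows >= 0 && row_ptr != NULL);

    /* Each iteration writes only y[i], so rows can be divided among threads
       without any synchronization. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* Hypothetical helper: dot product of two dense vectors of length n.
   The reduction clause gives each thread a private partial sum and
   combines the partial sums when the loop ends. */
double dot(int n, const double *a, const double *b)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

Compiled with OpenMP enabled (for example, -fopenmp with GCC), both loops run across the available cores; without that flag the pragmas are ignored and the same code runs serially. A static schedule is the simplest choice for the matrix-vector product; if the number of nonzeros varies a lot between rows, a dynamic or guided schedule may balance the work better.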
