This blog contains additional content for the article "Advanced Vectorization" from Parallel Universe #12:
This two-part series discusses how data and memory layout affect performance and suggests specific steps to improve it. The basic steps shown in the two articles can yield significant performance gains. The articles are written at an intermediate level and assume the reader wants to optimize software performance using common C, C++ and Fortran* programming...
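As a rough illustration of why layout matters, the sketch below contrasts an array-of-structures layout with a structure-of-arrays layout for the same data. It is not taken from the articles: the type names `PointAoS` and `PointsSoA`, the functions `sum_x_aos` and `sum_x_soa`, and the size `N` are all hypothetical. Both functions compute the same sum, but the second reads memory with unit stride, which is generally the easier pattern for a compiler to vectorize.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024  /* hypothetical element count, chosen for illustration */

/* Array of structures (AoS): x, y, z of one point sit next to each other. */
typedef struct { float x, y, z; } PointAoS;

/* Structure of arrays (SoA): all x values are contiguous, then y, then z. */
typedef struct { float x[N], y[N], z[N]; } PointsSoA;

/* Strided access: consecutive x values are three floats apart in memory. */
float sum_x_aos(const PointAoS *p, size_t n)
{
    assert(p != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += p[i].x;
    return sum;
}

/* Unit-stride access: consecutive x values are adjacent in memory. */
float sum_x_soa(const PointsSoA *p, size_t n)
{
    assert(p != NULL && n <= N);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += p->x[i];
    return sum;
}
```

Whether the SoA version is actually faster depends on the compiler, the target processor, and how the rest of the program uses the data, so it is worth measuring both before restructuring real code.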
See how the new Intel® AVX-512CD and Intel AVX-512F subsets of Intel® Advanced Vector Extensions 512 (available in the Intel® Xeon Phi processor and in future Intel Xeon processors) let the compiler automatically generate vector code with no changes to the source code.
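A histogram update is a common example of the kind of loop that conflict detection targets. The sketch below is an assumed illustration, not the article's own code, and the function name `histogram` and its parameters are hypothetical. Two iterations may write to the same bin, so a compiler cannot vectorize the loop safely unless it can detect repeated indices within a vector, which is what the conflict detection instructions are designed to help with.

```c
#include <assert.h>
#include <stddef.h>

/* Count how often each bin index occurs. Different iterations may hit the
 * same bin (idx[i] == idx[j]), which creates a possible write conflict
 * between vector lanes if the loop is vectorized naively. */
void histogram(unsigned *restrict hist, size_t nbins,
               const unsigned *restrict idx, size_t n)
{
    assert(hist != NULL && idx != NULL && nbins > 0);
    for (size_t i = 0; i < n; ++i)
        hist[idx[i] % nbins] += 1;  /* modulo keeps the index in range */
}
```

The source stays plain scalar C. Whether a particular compiler actually emits vector code for it depends on the compiler version, the target options, and its cost model, so checking the vectorization report is advisable.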
We have added a new simple SGEMM example to the Intel® SPMD Pro
Cython* is a superset of Python* that additionally supports calling C functions and declaring C types on variables and class attributes. Cython generates C extension modules, which the main Python program can load with the import statement.