Parallelize loops with Intel® Threading Building Blocks using Intel® C++ Compiler for lambda expressions.
The Intel® Compiler provides SIMD intrinsics APIs for short vector math library (SVML) and starting with Intel® Advanced Vector Extensions
Find out how to use the command-line interface in Intel® Advisor 2017 for a quick, initial analysis of loop performance that gives an overview of the hotspots in your code.
Matrix multiplication (MM) of two matrices is one of the most fundamental operations in linear algebra. The algorithm for MM is very simple, it could be easily implemented in any programming language. This paper shows that performance significantly improves when different optimization techniques are applied.
Get a background on vectorization and learn different techniques to evaluate its effectiveness.