This will be the final post in my planned short vectorization series, although I reserve the right to post more on vectorization in the future! In the first post on this topic, I explained that vectorization is parallelism inside a single CPU core, achieved by applying one CPU instruction to multiple data elements at once.
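As a quick reminder of what that looks like in source code, here is a minimal sketch of a loop that a vectorizing compiler can typically turn into SIMD instructions. The function name `add_arrays` and its signature are made up for illustration, and whether the loop is actually vectorized depends on the compiler, the optimization flags, and the target CPU.

```c
#include <stddef.h>

/* Hypothetical example: element-wise sum of two float arrays.
 * Each iteration is independent of the others, and `restrict` promises
 * the compiler that the arrays do not overlap, so it is free to process
 * several elements per instruction instead of one at a time. */
void add_arrays(float *restrict dst,
                const float *restrict a,
                const float *restrict b,
                size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

The code is written as an ordinary scalar loop; the result is the same whether or not the compiler vectorizes it, only the number of elements handled per instruction changes.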
A technical talk on the Conditional Numerical Reproducibility (CNR) feature in Intel® MKL 11.0.
A tutorial on how to use #pragma simd and SIMD-enabled function features in Intel® Cilk™ Plus.