For the Intel® compiler, vectorization is the unrolling of a loop combined with the generation of packed SIMD instructions. Because the packed instructions operate on more than one data element at a time, the loop can execute more efficiently. This process is sometimes called auto-vectorization to emphasize that the compiler identifies and optimizes suitable loops on its own.
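To make the idea concrete, the following C sketch contrasts a plain scalar loop with a hand-unrolled version of the same loop. It is only an illustration: the function names `scale_add` and `scale_add_unrolled4` and the unroll factor of 4 are assumptions made for this example, not part of the tutorial sources. The second function imitates in ordinary C what the compiler does when it vectorizes; the compiler itself would emit packed SIMD instructions for the loop body rather than four separate statements.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form: one element per iteration. The restrict qualifiers tell the
   compiler that x and y do not overlap, which makes the loop a straightforward
   candidate for vectorization. */
void scale_add(float *restrict y, const float *restrict x, float a, size_t n)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Hand-written picture of the transformed loop: four elements per iteration,
   followed by a remainder loop for the leftover elements. A vectorizing
   compiler would replace the four statements with packed instructions. */
void scale_add_unrolled4(float *restrict y, const float *restrict x, float a, size_t n)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     = a * x[i]     + y[i];
        y[i + 1] = a * x[i + 1] + y[i + 1];
        y[i + 2] = a * x[i + 2] + y[i + 2];
        y[i + 3] = a * x[i + 3] + y[i + 3];
    }
    for (; i < n; ++i)          /* remainder: fewer than 4 elements left */
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; the difference is only in how much work each loop iteration carries, which is what allows packed instructions to be used.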
Intel® Advisor can assist with vectorization and show optimization report messages alongside your source code. See https://software.intel.com/en-us/intel-advisor-xe for details.
Vectorization may call library routines that can result in a greater performance gain on Intel microprocessors than on non-Intel microprocessors. Vectorization can also be affected by certain compiler options, such as the m or x options.
The compiler enables vectorization at optimization level O2 (the default) and higher, for both Intel® microprocessors and non-Intel® microprocessors. Many loops are vectorized automatically. Where a loop is not vectorized, you may be able to make it vectorizable with simple code modifications. In this tutorial, you will:
establish a performance baseline
generate a vectorization report
improve performance by aligning data
improve performance using Interprocedural Optimization
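As a preview of the data alignment step listed above, the sketch below shows one portable way to obtain an aligned buffer in C. It is an assumption-laden illustration rather than the tutorial's own method: the helper name `alloc_aligned_floats` and the 64-byte alignment are chosen for this example, and it relies on the C11 `aligned_alloc` function, which is not available in every toolchain. The tutorial sources may use compiler-specific mechanisms instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 64  /* assumed alignment in bytes, chosen for illustration */

/* Hypothetical helper: returns a buffer for `count` floats whose start address
   is a multiple of ALIGNMENT, or NULL on failure. Release it with free(). */
float *alloc_aligned_floats(size_t count)
{
    size_t bytes = count * sizeof(float);
    /* aligned_alloc expects a size that is a multiple of the alignment,
       so round the request up (and never ask for zero bytes). */
    size_t padded = (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    if (padded == 0)
        padded = ALIGNMENT;

    float *p = aligned_alloc(ALIGNMENT, padded);
    assert(p == NULL || (uintptr_t)p % ALIGNMENT == 0);
    return p;
}
```

When the start of an array sits on such a boundary, the compiler has a better chance of using aligned packed loads and stores for loops over that array.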
This tutorial is available in Linux*, macOS*, and Windows* versions.