• 2019 Update 4
  • 03/20/2019

Vectorization Basics for Intel® Architecture Processors

Intel® Architecture Processors provide performance acceleration using Single Instruction Multiple Data (SIMD) instruction sets, which include:
  • Intel Streaming SIMD Extensions (Intel SSE)
  • Intel Advanced Vector Extensions (Intel AVX)
  • Intel Advanced Vector Extensions 2 (Intel AVX2)
By processing multiple data elements in a single instruction, these ISA extensions enable data parallelism in scientific, engineering, or graphics applications.
When using SIMD instructions, vector registers hold a group of data elements of the same data type, such as float or char. The number of data elements that fit in one register depends on the microarchitecture and on the width of the data type. For example, starting with the 2nd Generation Intel Core™ Processors, the vector register width is 256 bits, so each vector (YMM) register can store eight float numbers, eight 32-bit integer numbers, and so on.
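The element count per register follows from dividing the register width by the element width. The short C sketch below illustrates that arithmetic only; the function name lanes_per_register is hypothetical and is not part of any Intel SDK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements of a given size that fit in
   one vector register.
   register_bits: register width in bits (256 for a YMM register).
   element_bytes: size of one element in bytes, e.g. sizeof(float). */
size_t lanes_per_register(size_t register_bits, size_t element_bytes)
{
    assert(element_bytes > 0);
    return register_bits / (element_bytes * 8);
}
```

For a 256-bit register, a 4-byte element type such as float yields eight elements, and a 1-byte element type such as char yields 32.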
When using the Single Program Multiple Data (SPMD) technique, an OpenCL™ standard implementation can map the work-items to the hardware in one of two ways:
  • Scalar code, when work-items execute one-by-one.
  • SIMD elements, when several work-items fit in one register to run simultaneously.
The OpenCL Code Builder contains an implicit vectorization module, which implements the method with SIMD elements. Depending on the kernel code, implicit vectorization might be subject to some limitations. If the implicit vectorization module is disabled, the SDK uses the method with scalar code.
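The difference between the two mappings can be sketched in plain C. This is a conceptual emulation only, not the code that the implicit vectorization module generates: the element-wise addition kernel, the function names, and the VECTOR_WIDTH constant are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed packing factor: eight float work-items per 256-bit register. */
#define VECTOR_WIDTH 8

/* Hypothetical kernel body: one work-item adds one pair of elements. */
void work_item(const float *a, const float *b, float *c, size_t gid)
{
    c[gid] = a[gid] + b[gid];
}

/* Scalar mapping: work-items execute one by one. */
void run_scalar(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; ++gid)
        work_item(a, b, c, gid);
}

/* Packed mapping: VECTOR_WIDTH consecutive work-items are handled as one
   group. The inner loop stands in for a single SIMD instruction that
   operates on all lanes of a register at once. */
void run_packed(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    size_t gid = 0;
    for (; gid + VECTOR_WIDTH <= n; gid += VECTOR_WIDTH) {
        for (size_t lane = 0; lane < VECTOR_WIDTH; ++lane)
            c[gid + lane] = a[gid + lane] + b[gid + lane];
    }
    /* Remaining work-items that do not fill a whole group run as scalars. */
    for (; gid < n; ++gid)
        work_item(a, b, c, gid);
}
```

Both functions produce the same results. The packed version simply processes the work-items in groups of eight, which is the shape a vectorizer needs in order to replace the inner loop with SIMD instructions.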
Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804