Performance Essentials with OpenMP 4.0 Vectorization

Techniques for using vector hardware to potentially improve application performance through explicit vector programming with OpenMP* 4.0 in C/C++.

  • Developers
  • C/C++
  • Intel® Parallel Studio XE
  • OpenMP 4.0
  • Explicit Vector Programming
  • pragma omp declare simd
  • vector lanes
  • OpenMP*
  • Vectorization
  • Vectorization Essentials

    Compiler Methodology for Intel® MIC Architecture

    Vectorization Essentials


    This chapter covers vectorization, a form of data-parallel programming in which the processor performs the same operation simultaneously on N data elements of a vector (a one-dimensional array of scalar data objects such as integers or single- or double-precision floating-point values).
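    As a rough illustration of that definition, the sketch below applies one addition to every element of two arrays. The function name add_arrays is hypothetical and not taken from the chapter. The "#pragma omp simd" directive is the OpenMP 4.0 way to request vectorization of the loop; a compiler without OpenMP SIMD support enabled ignores it and runs the loop as ordinary scalar code.

```c
#include <stddef.h>

/* Hypothetical helper (not from the chapter): out[i] = a[i] + b[i].
 * The same scalar operation is applied to every element, which is the
 * pattern a vector unit can execute on N elements at a time. */
static void add_arrays(const float *a, const float *b, float *out, size_t n)
{
    /* Request vectorization of this loop. By using the directive the
     * programmer asserts that the iterations are independent. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

    Calling add_arrays(a, b, out, 8) on three 8-element float arrays fills out with the element-wise sums. How many elements are processed per instruction depends on the target hardware and the compiler options used.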

  • Developers
  • Linux*
  • C/C++
  • Fortran
  • Experts
  • Intel® C++ Compiler
  • Intel® Fortran Compiler
  • OpenMP*
  • Auto-vectorization
  • Intel® Xeon Phi™ Coprocessor
  • vectorization
  • compiler methodology
  • MIC
  • Intel® Cilk™ Plus
  • openmp
  • Intel® Many Integrated Core Architecture
  • Three programming points to mention on Offloaded Code for Intel® Graphics Technology

    Intel® Graphics Technology is a supported part of the compiler product. Developers should follow the programming guidelines to use the compiler and GT features efficiently.

    1. "#pragma offload target(gfx)" is required to mark the parallel loop as an "offload region". "__declspec(target(gfx))" does not do that; it merely states that the function should be compiled to run on the GFX target.

    For example, the following incorrect code snippet comes from a customer:

  • Developers
  • Microsoft Windows* (XP, Vista, 7)
  • Code for Good
  • C/C++
  • OpenMP*