• 10/30/2018

About This Document

This guide provides optimization guidelines for OpenCL™ applications targeting Intel® Core™ and Intel® Xeon® processors.
If your application targets Intel® processors with Intel® Graphics, refer to the corresponding
OpenCL™ Developer Guide for Intel® Processor Graphics.
The Intel® Xeon Phi™ coprocessor based on the Intel® Many Integrated Core (Intel® MIC) Architecture is supported only in OpenCL™ Runtime version 14.2.
This guide describes the three basic factors that most influence performance on multi-socket systems:
  • Threading scalability
    . Multi-socket Intel® Xeon® systems combine many Intel® CPU cores, so using thread parallelism is critical to achieving good performance.
  • Vectorization
    . Intel Xeon processors support wide vector registers and associated SIMD operations.
  • Memory bandwidth utilization
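As a rough mental model of how these three factors relate to an OpenCL kernel, the following plain C sketch emulates a one-dimensional NDRange serially. It is an illustration under stated assumptions, not the runtime's actual implementation: the names `saxpy_work_item`, `run_work_group`, and `run_ndrange` are hypothetical and are not part of any OpenCL API. The outer loop over work-groups is where thread parallelism could be exposed, the inner loop over consecutive work-items is the kind of loop that can be packed into SIMD lanes, and the unit-stride access to `x` and `y` is what lets each loaded cache line be fully used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel body: the work done by ONE work-item.
 * In OpenCL C this would be a __kernel function that obtains
 * its index from get_global_id(0). */
static void saxpy_work_item(size_t gid, float a, const float *x, float *y)
{
    y[gid] = a * x[gid] + y[gid];
}

/* One work-group: consecutive work-items touch consecutive elements,
 * so this inner loop is a natural candidate for SIMD execution and
 * reads memory with unit stride. */
static void run_work_group(size_t group_id, size_t local_size,
                           float a, const float *x, float *y)
{
    size_t base = group_id * local_size;
    for (size_t lid = 0; lid < local_size; ++lid)
        saxpy_work_item(base + lid, a, x, y);
}

/* Serial emulation of a 1-D NDRange. The work-groups are independent
 * of each other, so this outer loop is where a runtime could
 * distribute work across threads (assumption: conceptual only). */
void run_ndrange(size_t global_size, size_t local_size,
                 float a, const float *x, float *y)
{
    /* This sketch requires the global size to divide evenly into groups. */
    assert(local_size > 0 && global_size % local_size == 0);

    size_t num_groups = global_size / local_size;
    for (size_t g = 0; g < num_groups; ++g)
        run_work_group(g, local_size, a, x, y);
}
```

Calling `run_ndrange(8, 4, 2.0f, x, y)` on two 8-element arrays processes two work-groups of four work-items each and updates `y` in place.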
This guide explains which sections of code consume the most compute cycles and provides best-known methods for optimizing them.
To better understand the optimizations described in this guide, you must be familiar with the following concepts:
  • The OpenCL standard
  • Threading and Single Instruction Multiple Data (SIMD) vector instruction sets

See Also

OpenCL™ 1.2 Specification at
Overview Presentations of the OpenCL™ Standard at
