• 2019 Update 4
  • 03/20/2019

Use Preprocessor for Constants

Consider the following kernel:
__kernel void exponentor(__global int* data, const uint exponent)
{
    int tid = get_global_id(0);
    int base = data[tid];
    for (int i = 1; i < exponent; ++i)
    {
        data[tid] *= base;
    }
}
The number of iterations for the inner for loop is determined at run time, after the kernel is issued for execution. However, you can use the OpenCL™ dynamic compilation feature to make the exponent known at kernel compile time, which happens during host run time. In this case, the kernel appears as follows:
__kernel void exponentor(__global int* data)
{
    int tid = get_global_id(0);
    int base = data[tid];
    for (int i = 1; i < EXPONENT; ++i)
    {
        data[tid] *= base;
    }
}
The capitalization indicates that EXPONENT is a preprocessor macro.
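To see what the macro substitution amounts to, the kernel body can be mirrored in plain host-side C. This is only a sketch for reasoning about the loop, not OpenCL code: the names exponentor_item and exponentor_all are hypothetical, the #define stands in for the -DEXPONENT build option, and the default value of 3 is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the -DEXPONENT=<value> build option; 3 is an arbitrary default. */
#ifndef EXPONENT
#define EXPONENT 3
#endif

/* Plain C mirror of the kernel body for one work item (tid).
   The loop bound is a compile-time constant, so the compiler
   is free to unroll the loop completely. */
void exponentor_item(int *data, size_t tid)
{
    int base = data[tid];
    for (int i = 1; i < EXPONENT; ++i) {
        data[tid] *= base;
    }
}

/* Hypothetical helper: applies the work-item body to every element,
   emulating a one-dimensional range of `count` work items. */
void exponentor_all(int *data, size_t count)
{
    for (size_t tid = 0; tid < count; ++tid) {
        exponentor_item(data, tid);
    }
}
```

Because the loop starts at 1, it performs EXPONENT - 1 multiplications, so each element ends up raised to the power EXPONENT. A value of 0 or 1 leaves the data unchanged.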
The original version of the host code passes exponent_val through kernel arguments as follows:
clSetKernelArg(kernel, 1, sizeof(cl_uint), &exponent_val);
The updated version uses a compilation step:
sprintf(buildOptions, "-DEXPONENT=%u", exponent_val);
clBuildProgram(program, <...>, buildOptions, <...>);
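The formatting step can be made a little more defensive by bounding the write. The following is a minimal sketch, assuming a hypothetical helper named format_exponent_option. It uses only the standard snprintf call and leaves the clBuildProgram call itself to the host code shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "-DEXPONENT=<value>" into buf.
   Returns 0 on success, or -1 if formatting fails or buf is too
   small to hold the whole option string plus the terminator. */
int format_exponent_option(char *buf, size_t buf_size, unsigned int exponent_val)
{
    int written = snprintf(buf, buf_size, "-DEXPONENT=%u", exponent_val);
    if (written < 0 || (size_t)written >= buf_size) {
        return -1;  /* encoding error or truncated output */
    }
    return 0;
}
```

Checking the return value avoids passing a truncated option string, such as one with a clipped number, to the compiler.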
Thus, the value of EXPONENT is passed during preprocessing of the kernel code. Besides saving stack space used by the kernel, this also enables the compiler to perform optimizations, such as loop unrolling or elimination.
This approach requires recompiling the program every time the value of exponent_val changes. If you expect to change this value often, this approach is not advised. However, this technique is often useful for transferring parameters like image dimensions to video-processing kernels, where the value is only known at host run time, but does not change once it is defined.
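One way to keep the recompilation cost in check is to rebuild only when the value actually differs from the one used for the last successful build. The sketch below illustrates that bookkeeping in plain C. All names (exponent_build_state, ensure_built, build_fn, counting_build) are hypothetical, and the callback stands in for the sprintf and clBuildProgram steps rather than calling OpenCL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback type: performs the real build for a given value
   (for example, formatting the option string and calling clBuildProgram).
   Returns 0 on success, nonzero on failure. */
typedef int (*build_fn)(unsigned int exponent_val, void *user_data);

/* Remembers which value the current program was built with. */
typedef struct {
    bool built;
    unsigned int built_exponent;
} exponent_build_state;

/* Rebuilds only if no build exists yet or the value has changed. */
int ensure_built(exponent_build_state *state, unsigned int exponent_val,
                 build_fn build, void *user_data)
{
    if (state->built && state->built_exponent == exponent_val) {
        return 0;  /* existing build already matches this value */
    }
    int err = build(exponent_val, user_data);
    if (err != 0) {
        state->built = false;  /* do not trust a failed build */
        return err;
    }
    state->built = true;
    state->built_exponent = exponent_val;
    return 0;
}

/* Stand-in build step for illustration: only counts how often it runs. */
int counting_build(unsigned int exponent_val, void *user_data)
{
    (void)exponent_val;
    ++*(int *)user_data;
    return 0;
}
```

With this bookkeeping, repeated kernel launches with unchanged image dimensions or a similar parameter pay the compilation cost once. A parameter that changes on every launch still triggers a rebuild each time, in which case a kernel argument remains the better choice.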
