Developer Guide

NDRange Kernels

If your program is naturally expressed as many concurrent threads operating in a data-parallel manner, write your kernel so that parallel instances of it execute over a work-item index space (NDRange).

Avoid Work-Item ID-Dependent Backward Branching

The Intel® oneAPI DPC++/C++ Compiler collapses conditional statements into single bits that indicate when a particular functional unit becomes active. The compiler eliminates simple control flow paths that do not involve looping structures, resulting in a flat control structure and more efficient hardware use.
Avoid including any work-item ID-dependent backward branching (that is, branching that occurs in a loop) in your kernel because it degrades performance.
For example, the following code fragment illustrates backward branching whose loop bound depends on a work-item ID query such as get_global_id or get_local_id:
for (size_t i = 0; i < get_global_id(0); i++) {
  // statements
}
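
One way to remove the ID-dependent loop bound, sketched here as an assumption rather than a method this guide prescribes, is to give the loop a fixed trip count and move the work-item ID into a condition inside the loop body. That condition is a simple forward branch with no loop of its own, which is the kind of control flow the compiler is described above as collapsing. The sketch is plain host-side C++ in which gid stands in for the value of get_global_id(0). The names kMaxIters, id_dependent_loop, and fixed_bound_loop are hypothetical, and the approach assumes an upper bound on the ID is known at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compile-time upper bound on the work-item ID (an assumption).
constexpr std::size_t kMaxIters = 8;

// Pattern to avoid: the loop's backward branch depends on the work-item ID.
// gid stands in for get_global_id(0).
int id_dependent_loop(std::size_t gid, const std::vector<int>& data) {
    int sum = 0;
    for (std::size_t i = 0; i < gid; i++) {
        sum += data[i];
    }
    return sum;
}

// Possible restructuring: the trip count is fixed, and the ID only guards
// the body through a forward branch.
int fixed_bound_loop(std::size_t gid, const std::vector<int>& data) {
    assert(gid <= kMaxIters);  // the sketch is only equivalent under this bound
    int sum = 0;
    for (std::size_t i = 0; i < kMaxIters; i++) {
        if (i < gid) {
            sum += data[i];
        }
    }
    return sum;
}
```

For every gid from 0 through kMaxIters, both functions return the same sum. Every work-item now runs kMaxIters iterations, so whether the fixed-bound form yields better hardware for a particular kernel is best confirmed in the compiler's optimization report.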
