Omit Hardware that Generates and Dispatches Kernel IDs
The [[intel::max_global_work_dim(0)]] kernel attribute instructs the Intel® oneAPI DPC++/C++ Compiler to omit logic that generates and dispatches global, local, and group IDs into the compiled kernel.
Semantically, the [[intel::max_global_work_dim(0)]] kernel attribute specifies that the global work dimension of the kernel is zero. Setting this kernel attribute means that the kernel does not use any global, local, or group IDs. The presence of this attribute in the kernel code serves as a guarantee to the compiler that the kernel is a single work-item kernel.
When compiling the following kernel, the compiler generates interface hardware as illustrated in Figure 1:
    cgh.single_task<class kernelComputeAsTask>(
        [=]() [[intel::max_global_work_dim(0)]] {
          for (unsigned i = 0; i < SIZE; i++) {
            accessorRes[i] = accessorIdx[i] * 2;
          }
        });
A kernel that carries the [[intel::max_global_work_dim(0)]] attribute must be invoked as a task with single_task, not with parallel_for.
Figure 1. Compiler-generated Interface Hardware for a Kernel with the [[intel::max_global_work_dim(0)]] Attribute
If your current kernel implementation has multiple work-items but does not use global, local, or group IDs, you can use the [[intel::max_global_work_dim(0)]] kernel attribute if you modify the kernel code accordingly:
- Wrap the kernel body in a for loop that iterates as many times as the number of work-items.
- Use cgh.single_task<kernelName> to invoke the device code.