• 2019 Update 4
  • 03/20/2019
  • Public Content

Eliminating Device Starvation

Schedule the command queue for each device asynchronously. On the host, enqueue multiple kernels on every queue first, then flush the queues so that the kernels begin executing on the devices, and only then wait for the results. Refer to the section "Synchronization Caveats" for more information.
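The enqueue-first, wait-last ordering described above can be sketched with a host-side analogy. This is not OpenCL code: each device is modeled as a single-worker thread pool standing in for an in-order command queue, and `fake_kernel`, `run_async`, and `overlaps` are hypothetical names introduced only for illustration. In a real OpenCL host program the corresponding steps would be `clEnqueueNDRangeKernel` on each queue, `clFlush` on each queue, and then `clFinish` or `clWaitForEvents`. In this sketch `submit` starts work immediately, so there is no separate flush step.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fake_kernel(device_name, work_seconds):
    """Stand-in for a kernel: sleeps, then reports when the 'device' was busy."""
    start = time.monotonic()
    time.sleep(work_seconds)
    return device_name, start, time.monotonic()


def run_async(device_names, kernels_per_device=2, work_seconds=0.05):
    # One single-worker executor per device models an in-order command queue.
    queues = {name: ThreadPoolExecutor(max_workers=1) for name in device_names}
    try:
        # Step 1: enqueue work on every queue before waiting on anything.
        pending = [
            queues[name].submit(fake_kernel, name, work_seconds)
            for _ in range(kernels_per_device)
            for name in device_names
        ]
        # Step 2: only now block until all devices have finished.
        wait(pending)
        return [future.result() for future in pending]
    finally:
        for q in queues.values():
            q.shutdown()


def overlaps(a, b):
    """True if the busy intervals of two kernel records intersect."""
    _, a_start, a_end = a
    _, b_start, b_end = b
    return a_start < b_end and b_start < a_end


if __name__ == "__main__":
    records = run_async(["cpu", "gpu"])
    print("devices ran concurrently:", overlaps(records[0], records[1]))
```

Waiting on the first device before submitting to the second would serialize the two; submitting everything up front is what lets the busy intervals overlap.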
Another approach is to use a separate thread for the GPU command queue. Specifically, you can dedicate a physical CPU core to scheduling GPU tasks. To reserve a core, you can use the device fission extension, which can prevent GPU starvation in some cases. Refer to the User Manual - OpenCL™ Code Builder for more information on the device fission extension.
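The dedicated-thread idea can be illustrated with a minimal sketch, again as a plain Python analogy and not OpenCL code. A single thread does nothing but feed the "GPU", while the main thread remains free for other work. The names `gpu_scheduler`, `square`, and `run_with_dedicated_thread` are hypothetical, the squaring task is a placeholder for real GPU work, and reserving a physical core through device fission is not shown here.

```python
import queue
import threading


def square(x):
    """Placeholder for a task that would be enqueued on the GPU queue."""
    return x * x


def gpu_scheduler(tasks, results):
    """Runs on its own thread; its only job is to feed the 'GPU'."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: no more work
            break
        func, arg = task
        # Stand-in for enqueueing and flushing work on the GPU command queue.
        results.append(func(arg))


def run_with_dedicated_thread(items):
    tasks = queue.Queue()
    results = []
    worker = threading.Thread(
        target=gpu_scheduler, args=(tasks, results), name="gpu-scheduler"
    )
    worker.start()
    for item in items:
        tasks.put((square, item))
    # The main thread is free for CPU-side work while the scheduler runs.
    cpu_side = sum(items)
    tasks.put(None)  # tell the scheduler thread to stop
    worker.join()    # results are read only after the thread has finished
    return results, cpu_side


if __name__ == "__main__":
    print(run_with_dedicated_thread([1, 2, 3]))
```

Because only the scheduler thread appends to `results` and the main thread reads it after `join`, no extra locking is needed in this sketch.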
Experiment with both approaches, as various trade-offs are possible: for example, a core reserved for scheduling is no longer available for CPU-side kernels.