General Matrix Multiply
In this tutorial, we give an in-depth presentation of the architecture and microarchitecture of the media and graphics accelerator. We explain the trade-off between general-purpose compute and fixed-function hardware, discuss the advantages and disadvantages of on-die integration, and present the programming models that are supported. We will present some...
Download PDF (1.5 MB)
This paper details the implementation of out-of-order queues, an OpenCL™ construct that allows independent kernels to execute simultaneously whenever possible and thus keeps all GPU assets fully utilized.
Please see the new portal for OpenCL™ deployments prior to accessing this legacy content.
Learn core concepts of developing OpenCL™ applications with Intel® SDK for OpenCL™ Applications 2019.