Host-Side Timing

The following code snippet is a host-side timing routine around a kernel call (error handling is omitted):

float start = …; // get the first timestamp
clEnqueueNDRangeKernel(g_cmd_queue, …);
clFinish(g_cmd_queue); // block until the kernel has completed
float end = …; // get the last timestamp
float time = end - start;

In this example, host-side timing is implemented using the following functions:

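  • clEnqueueNDRangeKernel adds a kernel to a queue and returns immediately, without waiting for the kernel to execute.
  • clFinish blocks the host until all previously queued commands, including the kernel, have completed. Alternatively, you can wait on the kernel's event with clWaitForEvents.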
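
The pattern above can be sketched as a small reusable helper. This is a minimal illustration under stated assumptions, not the guide's own implementation: the names now_seconds, time_region_seconds, and stand_in_region are hypothetical, the timestamp source is assumed to be the POSIX monotonic clock, and the OpenCL calls are replaced by a stand-in callback so that the sketch compiles without an OpenCL SDK. Timestamps are kept as double so that sub-millisecond differences are not lost to rounding.

```c
#define _POSIX_C_SOURCE 199309L /* exposes clock_gettime on strict C compilers */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: the operations that should be timed. */
typedef void (*timed_region_fn)(void *user_data);

/* Current time in seconds, read from a monotonic clock (assumed POSIX). */
static double now_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Takes a timestamp, runs the region, takes a second timestamp,
 * and returns the difference in seconds. */
static double time_region_seconds(timed_region_fn region, void *user_data)
{
    assert(region != NULL);
    double start = now_seconds();
    region(user_data);
    double end = now_seconds();
    return end - start;
}

/* Stand-in for the real work. In an OpenCL host program this body would
 * enqueue the kernel with clEnqueueNDRangeKernel and then call clFinish
 * on the same queue. Here it only counts invocations. */
static void stand_in_region(void *user_data)
{
    int *calls = (int *)user_data;
    ++*calls;
}
```

A call such as time_region_seconds(stand_in_region, &calls) returns the elapsed time of one invocation. In a real host program the callback has to include the blocking call (clFinish or clWaitForEvents): because the enqueue returns immediately, a region without it would measure little more than the cost of enqueueing.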

Wrapping the Right Set of Operations

When you use any host-side routine to evaluate the performance of your kernel, make sure that you wrap the proper set of operations.

For example, keep potentially costly and/or serializing routines out of the timed region, such as:

  • printf calls
  • File input or output operations

Also, profile kernel execution and data transfers separately by using OpenCL™ profiling events. Similarly, track compilation and general initialization costs, such as buffer creation, separately from the actual execution flow.
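
One way to follow this guidance is to accumulate each phase into its own slot and to format the report only after every measurement has been taken, so that no output routine runs inside a timed region. The sketch below is an illustration under assumptions: the phase names, struct phase_times, run_phase, format_report, and stand_in_work are all hypothetical, and host timers (assumed POSIX monotonic clock) are used for every phase for brevity. For kernel execution versus data transfers, profiling events would supply device-side timestamps instead.

```c
#define _POSIX_C_SOURCE 199309L /* exposes clock_gettime on strict C compilers */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical phases, each tracked separately. */
enum phase { PHASE_SETUP, PHASE_TRANSFER, PHASE_KERNEL, PHASE_COUNT };

struct phase_times {
    double seconds[PHASE_COUNT]; /* accumulated time per phase */
};

/* Current time in seconds, read from a monotonic clock (assumed POSIX). */
static double now_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Runs one phase and adds its duration to the matching slot.
 * Nothing is printed here, so the timed region stays free of I/O. */
static void run_phase(struct phase_times *times, enum phase which,
                      void (*work)(void *), void *user_data)
{
    assert(times != NULL && work != NULL);
    assert(which >= PHASE_SETUP && which < PHASE_COUNT);
    double start = now_seconds();
    work(user_data);
    times->seconds[which] += now_seconds() - start;
}

/* Formats the breakdown after all measurements are done.
 * Returns what snprintf returns: the length the full text requires. */
static int format_report(const struct phase_times *times,
                         char *buf, size_t buf_size)
{
    return snprintf(buf, buf_size,
                    "setup %.6f s, transfer %.6f s, kernel %.6f s",
                    times->seconds[PHASE_SETUP],
                    times->seconds[PHASE_TRANSFER],
                    times->seconds[PHASE_KERNEL]);
}

/* Stand-in for real work such as building the program, creating buffers,
 * or enqueueing a kernel and waiting for it. It only counts invocations. */
static void stand_in_work(void *user_data)
{
    ++*(int *)user_data;
}
```

Calling run_phase several times with PHASE_KERNEL accumulates the total kernel time, while one-time costs recorded under PHASE_SETUP stay visible as a separate number instead of inflating the kernel figure.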

See Also

Profiling Operations Using OpenCL™ Profiling Events
