cl_device_profiling_timer_resolution

Hi,

The SDK seems to be reporting sporadic/incorrect values for the profiling timer resolution. For example, using the following code:

/* The OpenCL specification defines the return type of this query as size_t. */
size_t timerResolution;
clGetDeviceInfo(device, CL_DEVICE_PROFILING_TIMER_RESOLUTION, sizeof(timerResolution), &timerResolution, NULL);
printf("Timer Resolution: %zu\n", timerResolution);

I am getting a different (and clearly incorrect) value each time. The same code works for other devices, and for the AMD SDK on the same CPU (Intel E8500). I'm using the latest SDK on 64-bit Scientific Linux 6.

Fairly sure this is a bug in the SDK, just wanted to bring it to your attention in case you're not already aware.

Cheers.


Hi,
Thanks for the report.
Can you specify which Linux version you are using?
Currently we support only SUSE 11 and Red Hat 6.

Hi,

I'm using Scientific Linux 6.0, which I believe is based on RHEL 6 and should (in theory) be fully compatible with anything RHEL can run. However, I'm not sure if that's a close enough link to be covered by your support, so I apologise if this bug report isn't valid. I'm afraid I don't have access to any pure RHEL6/SUSE11 systems to test this out on.

Thanks.

Have you read and used the guidelines in "OpenCL Code Performance Debugging Intro"? They might help you understand how to read the counters and how to interpret your performance.

Enjoy,
Arnon

Hi,

From your profiling guide:

//the resolution of the events is 1e-06   
g_NDRangePureExecTime = (cl_double)(end - start)*(cl_double)(1e-06); 

You seem to be assuming that the resolution is always 1e-06. While this might be true for Intel CPUs, it certainly isn't true for other OpenCL capable devices, and therefore this method of timing is not at all portable.

The whole point of CL_DEVICE_PROFILING_TIMER_RESOLUTION is that it allows you to time OpenCL operations in a device-portable way.

Does this mean that the Intel SDK doesn't implement this parameter?

Thanks.

Hi,

Thanks again for your reports.

1. We have identified the problem with CL_DEVICE_PROFILING_TIMER_RESOLUTION, and it will be fixed in the next release.
Meanwhile, please use clock_getres(CLOCK_MONOTONIC, ...) to obtain the value that CL_DEVICE_PROFILING_TIMER_RESOLUTION should report. The updated version will return the same value.

2. OpenCL events always return performance counter values in nanoseconds (10^-9 s). We will correct the typo and add a clarification to this section.

Thanks,
Evgeny
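
As a rough illustration of the workaround and the unit clarification in the reply above, the following C sketch queries the resolution of CLOCK_MONOTONIC with clock_getres and converts a pair of nanosecond timestamps to seconds. The function names monotonic_resolution_ns and elapsed_seconds are hypothetical, and the sketch assumes a POSIX system; it does not call the OpenCL API itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: resolution of CLOCK_MONOTONIC in nanoseconds,
 * or 0 if clock_getres fails. */
uint64_t monotonic_resolution_ns(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return 0;
    return (uint64_t)res.tv_sec * 1000000000ull + (uint64_t)res.tv_nsec;
}

/* Hypothetical helper: elapsed time in seconds between two timestamps
 * that are both expressed in nanoseconds. */
double elapsed_seconds(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);                 /* timestamps must be ordered */
    return (double)(end_ns - start_ns) * 1e-9;  /* ns -> s */
}
```

In an OpenCL program, the two timestamps would typically come from clGetEventProfilingInfo with CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, which report values in nanoseconds.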

Many thanks, I'll look out for the next release.