event profiling data on HD4000 always starts at 0

Hi,

I'm trying to profile my application using clGetEventProfilingInfo(). Rather than just looking at the kernel execution time (CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_START), I'd like to know the absolute times at which the kernels were executed, so I can draw a timeline. On the CPU device this works just fine, but on the GPU device (HD4000) the timer seems to be reset for every event.

I wrote a simple program that calls a kernel N times, each call followed by clFinish. This is the information I get from the profiler:

0 .. 15120 .. 11875440 .. 18972640
0 .. 6960 .. 917520 .. 7889600
0 .. 5760 .. 148800 .. 7247600
0 .. 5920 .. 159520 .. 7335520
0 .. 5840 .. 154000 .. 7310320
0 .. 5840 .. 199520 .. 7343760
0 .. 5920 .. 156640 .. 7296240
0 .. 6000 .. 148720 .. 7254880
0 .. 6080 .. 150240 .. 7294000
0 .. 6240 .. 149680 .. 7293920

The numbers are the return values of clGetEventProfilingInfo with CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, respectively.

As you can see, the timer always starts at 0 again. It seems like each event has its own counter that starts at 0 when the command is queued. Why does the GPU behave differently from the CPU?

Dominik

I'll take a look and get back to you on what's going on.

Thanks,
Raghu

Considering this post is two years old, I don't really expect a reply, but I just ran into this problem myself and was wondering if there was any resolution?

Thanks, ~ben

Ben,

What processor, OS (including OS version), and driver version are you using? Which version of OpenCL?

Thanks!

Robert

Ben,

I just tried the latest driver on Windows 8. Even with OpenCL 1.2, five counters are supported: CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START, CL_PROFILING_COMMAND_END, and CL_PROFILING_COMMAND_COMPLETE.

I query and print them as follows, and they seem to return reasonable, increasing values (at least for QUEUED, START, and COMPLETE).

    /* Query each profiling counter (in nanoseconds) for event i; ciErrNum should be checked against CL_SUCCESS after every call. */
    ciErrNum = clGetEventProfilingInfo(pmy_events[i], CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), &param_command_queued, NULL);
    ciErrNum = clGetEventProfilingInfo(pmy_events[i], CL_PROFILING_COMMAND_SUBMIT, sizeof(cl_ulong), &param_command_submit, NULL);
    ciErrNum = clGetEventProfilingInfo(pmy_events[i], CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &param_command_start, NULL);
    ciErrNum = clGetEventProfilingInfo(pmy_events[i], CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &param_command_end, NULL);
    ciErrNum = clGetEventProfilingInfo(pmy_events[i], CL_PROFILING_COMMAND_COMPLETE, sizeof(cl_ulong), &param_command_complete, NULL);
    /* cl_ulong is 64 bits wide, but unsigned long is only 32 bits on Windows, so cast and print with %llu rather than %lu. */
    printf("iteration: %d, QUEUED: %llu, SUBMIT: %llu, START: %llu, END: %llu, COMPLETE: %llu\n", i, (unsigned long long)param_command_queued, (unsigned long long)param_command_submit, (unsigned long long)param_command_start, (unsigned long long)param_command_end, (unsigned long long)param_command_complete);

 
