OpenCL GPU analysis working partially

First of all, many thanks to the VTune team for implementing OpenCL GPU profiling in the tool.

I was trying out the tool on a command-line OpenCL application running on the HD 4000. I followed the documentation and was able to enable GPU profiling support in VTune. I profiled my application, and some metrics, such as the average kernel execution time and EU array busy and stalled, work fine. However, some other metrics, such as memory bandwidth, still report 0.0. I have the latest HD 4000 driver installed with OpenCL 1.2 support. My application is not using DirectX, only OpenCL.

Any ideas?

Thanks for trying VTune GPU OpenCL support.
On some platforms, collection of extended metrics such as L3 misses, memory accesses, sampler busyness, and SLM accesses is disabled by default. To enable them, check the BIOS: it might have an option (somewhere in the Graphics section) named Intel(R) Graphics Performance Analyzers with the values Enabled/Disabled. Set it to "Enabled".
However, this option is BIOS vendor specific, and some BIOSes might not have it or might name it differently.
Hopefully this helps.

BTW: Which VTune 2013 update are you using, Update 4 or Update 5?

Thank you.

Thanks for the info. Unfortunately, I am using a laptop from MSI with a barebones BIOS, and I don't see any option to enable graphics performance analyzers. I will check with the manufacturer to see if they can add the option, though I am not optimistic. I am using a Core i7 3610QM.

I am using VTune Amplifier XE 2013 Update 5.

Thanks for letting us know.
This limitation (partial metrics) applies only to 3rd generation Intel® Core™ processors. All of these metrics will be available out of the box on 4th generation Intel® Core™ processors.
