I'm using VTune to analyse some AVX image processing routines I'm writing. I'm compiling with Visual Studio Pro 2012 and using the latest VTune in 64-bit C++ mode.
I have symbols turned on in my release build, so I can see the functions in the Functions Grouping page just fine. But when I drill down into my function, most of the blocks in the timings pane are empty, except at the end of the loop where it increments some pointers.
for (/* some loop conditions */)
{
    // lots of AVX stuff
    // more AVX stuff
    // increment pointers   <- the only line with timing data; it seems to cover everything above
}
I'm only getting CPU_CLK_UNHALTED.THREAD timings next to the pointer increments at the end of the loop. It's as if it can't work out which source lines map to which assembly instructions, which I find odd, as the AVX intrinsics map pretty much 1:1.
Is there any way to improve this, other than looking at the assembler pane?