Is there a disassembler for the OpenCL SDK that will display the GPU code produced for a kernel?
We do not provide a disassembler for displaying GPU code, and we currently have no plans to provide one. I am curious why you want to look at the disassembly. Do you just want to understand what code gets generated, or is it for debugging purposes? Please let me know your reasons and I will pass them on to the graphics guys.
I am interested in seeing what optimizations the compiler is performing. For example, I have some code that says X = (cos(a) - cos(b))*(cos(a) + cos(b)). Looking through the disassembly, I was surprised to discover that the VS 2010 C++ compiler generates four calls to cos for that code. I would have expected it to cache and reuse the results of the first two. I am moving this code into a kernel, and I am wondering what optimizations I can expect the kernel compiler to perform.
Like Jerome, I would just like to see the generated device-dependent assembly so that I get the chance to optimize manually, i.e. change the C code and compile again. Looking at the LLVM code in the .ir file was already a great help.
It would be great to have a --gpu_disasm option for the compiler. AMD and NVIDIA already have something like this.
Best regards, Stephan