I am writing an OpenCL kernel that computes the SAD (sum of absolute differences) of a 'char' vector. I understand that HD Graphics 4000 (and above) has special sad2 and sada2 instructions. How can I make sure that the OpenCL code I write gets compiled into those sad2 or sada2 instructions? I ask because sad2 and sada2 behave a little differently from the OpenCL-defined "abs_diff" function.
I also have a related question, which has been asked in this forum before: are there any plans to support viewing the disassembly of OpenCL kernels on HD Graphics (or Iris Pro) GPUs? I would like to verify that the OpenCL code I wrote gets compiled into sad2 or sada2 instructions.
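For reference, the SAD computation described above is roughly the following, written here as plain C on the host so it can be checked without a GPU. This is only a sketch: the name sad_u8 is hypothetical, and I am assuming unsigned 8-bit elements, which may not match the signedness of the 'char' data in the actual kernel. In the kernel itself, the per-element step would be the built-in abs_diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scalar reference for the SAD of two byte vectors.
 * Assumption: elements are unsigned 8-bit values.
 * In an OpenCL kernel the per-element step would be abs_diff(a[i], b[i]). */
uint32_t sad_u8(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);

    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        /* |a[i] - b[i]| without going through a signed intermediate */
        uint8_t d = (a[i] > b[i]) ? (uint8_t)(a[i] - b[i])
                                  : (uint8_t)(b[i] - a[i]);
        /* Accumulate in a wider type so the sum cannot wrap at 8 bits */
        sum += d;
    }
    return sum;
}
```

A reference like this could be used to compare against the kernel's output, whichever instructions the compiler ends up emitting.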