I'm seeing new 'strange' behavior from an OpenCL kernel running on the Xeon Phi.
In this case, I have a small example written in HPL that runs correctly on the CPU and the GPU, but not on the Xeon Phi.
I've attached the example as a .cpp file. You can download the HPL library to test it, or you can reproduce it with OpenCL (if you need the OpenCL code, please ask me). The problem is in the following loop: