First of all, thanks for releasing a new version of the Intel OpenCL toolkit! I've downloaded the new version and have found two issues:
- First, the compiler is much slower than it was in the 2012 release. That is not bad in itself. What makes it bad is that building a program from a binary appears to take just as long as building it from source. As a result, binary caching brings no benefit, and developers are stuck waiting for kernels to compile on every run. This gets old quickly...
- Second, the PyOpenCL test suite (current git, http://github.com/inducer/pyopencl ) fails in the segmented scan test. Since this code runs successfully on AMD (CPU, APU, GPU), Nvidia, and the Intel 2012 release, I am currently leaning towards there being a correctness issue in the 2013 release. I'll continue to investigate though.
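For the first issue, here is a minimal sketch of how the source-versus-binary comparison could be reproduced with PyOpenCL. It is not necessarily the exact measurement behind the numbers above. The helper names `timed` and `compare_build_times` are made up for illustration, and the sketch assumes that setting the `PYOPENCL_NO_CACHE` environment variable is enough to keep PyOpenCL's own compiler cache out of the picture.

```python
import os
import time


def timed(fn):
    """Run fn() and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn()
    return time.perf_counter() - start, result


def compare_build_times(source):
    """Time a build from source against a rebuild from the resulting binary.

    Hypothetical helper: `source` is any OpenCL C kernel source string.
    """
    # PyOpenCL keeps its own on-disk compiler cache; disable it so the
    # source build really goes through the vendor compiler.
    os.environ["PYOPENCL_NO_CACHE"] = "1"

    # Imported here so that `timed` stays usable without an OpenCL runtime.
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    devices = ctx.devices

    # Build from OpenCL C source.
    t_source, prg = timed(lambda: cl.Program(ctx, source).build())

    # Retrieve the device binaries and build a second program from them.
    binaries = prg.binaries
    t_binary, _ = timed(lambda: cl.Program(ctx, devices, binaries).build())

    return t_source, t_binary
```

Calling `compare_build_times(src)` with a non-trivial kernel source should show whether the second number is meaningfully smaller than the first; if the two are about equal, caching binaries saves nothing.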
I'd appreciate your feedback on these issues.