I've attached a test case (a gutted version of a segmented scan) that gets miscompiled by intel_sdk_for_ocl_applications_2013_xe_runtime_3.0.67279_x64, at least when running on my i7-2620M.
Specifically, when running this self-contained test code using PyOpenCL, I get the line
printed 16 times for each of the two work groups. If you look at the kernel, that means that the printf() in the trailing snippet:
    if (get_local_id(0) == 0)
        printf("gid:%d fsii:%d\n", psc_GID_0, psc_first_segment_start_in_interval);
got executed 16 times for group id 0. In my book, it should be executed exactly once per work group, since only the work item with local id 0 passes the guard. (Confirmed by running against other implementations. The Intel OpenCL 2012 runtime also gets this right.)
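For anyone who wants to check captured output from their own runtime, here is a rough Python sketch of the expectation. It does not touch OpenCL at all: `simulate_guarded_printf` just emulates the guard on the host, and `count_lines_per_group` tallies `gid:... fsii:...` lines per group id. Both function names are my own, the work-group size of 16 is an assumption inferred from the 16 repeats, and the `fsii` value of 0 is a placeholder, not the value the real kernel prints.

```python
import re
from collections import Counter

# Matches lines produced by printf("gid:%d fsii:%d\n", ...) in the kernel.
LINE_RE = re.compile(r"^gid:(-?\d+) fsii:(-?\d+)$")


def simulate_guarded_printf(num_groups, local_size, fsii=0):
    """Emulate what a correct implementation should print.

    fsii is a placeholder value (assumption); the real kernel computes it.
    """
    lines = []
    for gid in range(num_groups):
        for lid in range(local_size):
            if lid == 0:  # mirrors: if (get_local_id(0) == 0)
                lines.append(f"gid:{gid} fsii:{fsii}")
    return lines


def count_lines_per_group(output):
    """Count how many matching printf lines appear for each group id."""
    counts = Counter()
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two work groups (from the report), 16 work items each (assumed).
expected = simulate_guarded_printf(num_groups=2, local_size=16)
print(dict(count_lines_per_group("\n".join(expected))))  # {0: 1, 1: 1}
```

Feeding the stdout captured from the affected runtime into `count_lines_per_group` should give a count of 1 for every group id; a count of 16 would match the behavior described above.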