Weird bug on Intel OpenCL platform

Attached is an OpenCL kernel that triggers an odd bug in Intel's OpenCL platform. On the same CPU, the kernel works on the AMD OpenCL platform but not on the Intel one. I have verified this on multiple independent machines (all Windows 7, both 32-bit and 64-bit). I have also attached a tiny NetBeans project (with JOCL embedded) that contains the full code to reproduce the bug. If you don't use NetBeans, it is easy to reproduce anywhere: just feed an empty float[100] to the kernel and read it back afterwards.

The program simply executes the same kernel 4 times. You should obviously get the same output every time. On the Intel platform, however, the first execution yields a different result. It is consistently the first execution, and consistently the same wrong output, on different machines. Yet the kernel itself is pretty trivial and keeps no state between executions.
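For anyone reproducing this without the NetBeans project, the comparison itself can be sketched in plain Java. This is only a sketch: the class and method names are made up for illustration, and the Supplier stands in for whatever code enqueues the kernel and reads the float buffer back (that JOCL part is not shown here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class RepeatCheck {
    /**
     * Calls kernelRun `runs` times and returns the indices of the runs whose
     * output differs from the output of the last run.
     */
    static List<Integer> divergentRuns(Supplier<float[]> kernelRun, int runs) {
        List<float[]> outputs = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            // Copy the result in case the caller reuses one host buffer.
            outputs.add(kernelRun.get().clone());
        }
        float[] reference = outputs.get(runs - 1);
        List<Integer> divergent = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            if (!Arrays.equals(outputs.get(i), reference)) {
                divergent.add(i);
            }
        }
        return divergent;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the real kernel launch: the first call
        // returns zeros, later calls return fixed values.
        int[] calls = {0};
        Supplier<float[]> fake = () -> calls[0]++ == 0
                ? new float[] {0.0f, 0.0f}
                : new float[] {0.13983674f, 0.39817378f};
        System.out.println(divergentRuns(fake, 4)); // prints [0]
    }
}
```

With the real kernel plugged in, an empty list would mean all executions agree, and [0] would match the behaviour described above.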

I would classify this as a severe bug, since it doesn't just crash, but instead provides erroneous output without any notice!

Attachments:
intelbug.zip (1.43 MB)
bugintel.cl.txt (3.24 KB)

I am trying to reproduce your issue and installed NetBeans and compiled the source files. It has been several years since I worked with Java so I am a bit rusty. I'll file a bug once I am able to debug it further. I will keep you posted.

Thanks,
Raghu

Thanks. One detail that I only now realize may be relevant: on my system, platform 0 is AMD and platform 1 is Intel.
So in JOCL.java, on line 154, you may have to replace "final int platformIndex = 1;" with "final int platformIndex = 0;" if only the Intel platform is installed, and probably also if the Intel platform was installed first.
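One way to avoid depending on the installation order would be to pick the platform by name instead of by a fixed index. The sketch below shows only the selection logic. The helper name is hypothetical and the name strings are examples; in JOCL the real names would have to be queried with clGetPlatformInfo and CL_PLATFORM_NAME, which is not shown here.

```java
import java.util.Arrays;
import java.util.List;

class PlatformPicker {
    /**
     * Returns the index of the first platform whose name contains the given
     * vendor string (case-insensitive), or -1 if no platform matches.
     */
    static int indexOfVendor(List<String> platformNames, String vendor) {
        String needle = vendor.toLowerCase();
        for (int i = 0; i < platformNames.size(); i++) {
            if (platformNames.get(i).toLowerCase().contains(needle)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Example names only; the actual strings depend on the installed drivers.
        List<String> names = Arrays.asList(
                "AMD Accelerated Parallel Processing",
                "Intel(R) OpenCL");
        System.out.println(indexOfVendor(names, "Intel")); // prints 1
    }
}
```

The returned value could then replace the hard-coded platformIndex, with -1 treated as "Intel platform not installed".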

Yup. I already made that change. I also realized that the size of the Lclassic array doesn't have to be 100. Even an array of size 1 produces different results. I am looking at your kernel to understand what you are doing. Just making sure...you get the same result for each iteration on other platforms right?

I'm generating random numbers and filling the array with them. I'm using http://cas.ee.ic.ac.uk/people/dt10/research/rngs-gpu-mwc64x.html with a small modification that soft-codes the MWC64X_A variable, to avoid a bug in AMD's OpenCL compiler.
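A host-side reference for the generator can help decide which platform's output is the correct one. The following is a rough Java sketch of one MWC64X step as I understand the published algorithm. The multiplier value is an assumption taken from that generator, not from the attached kernel, and the class name is made up. How the kernel converts the 32-bit output to a float is not stated in this thread, so that step is left out.

```java
class Mwc64xReference {
    // Assumed multiplier of the published MWC64X generator; the attached
    // kernel soft-codes this value, so check it against bugintel.cl.txt.
    static final long A = 4294883355L;

    private long x; // low 32 bits of the state
    private long c; // carry, high 32 bits of the state

    Mwc64xReference(long x, long c) {
        this.x = x & 0xFFFFFFFFL;
        this.c = c & 0xFFFFFFFFL;
    }

    /** Returns the next 32-bit output as an unsigned value stored in a long. */
    long next() {
        long result = (x ^ c) & 0xFFFFFFFFL;
        // A * x + c fits in 64 unsigned bits; Java's wrapping long arithmetic
        // yields the same bit pattern as unsigned 64-bit arithmetic.
        long t = A * x + c;
        x = t & 0xFFFFFFFFL; // new low word
        c = t >>> 32;        // new carry
        return result;
    }

    public static void main(String[] args) {
        Mwc64xReference rng = new Mwc64xReference(1L, 0L);
        System.out.println(rng.next()); // prints 1
        System.out.println(rng.next()); // prints 4294883355
    }
}
```

Seeding this with the same state the kernel uses and comparing the first few values against each platform's output would show whether the zeros in the first run are plausible generator output.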

Quote:

Raghupathi Muthyalampalli (Intel) wrote:
Just making sure...you get the same result for each iteration on other platforms right?

Yes. On the AMD platform, the output is always the same, for both CPU and GPU. On Intel, the first row is different:

[0.0, 0.0, 0.0, 0.0, 0.5441741, 0.6534582, 0.9902577, 0.5279418, 0.7744405, 0.3976747, 0.734416, 0.0017434603, 0.22150545, 0.82592,
[0.13983674, 0.39817378, 0.7858866, 0.90109515, 0.5441741, 0.6534582, 0.9902577, 0.5279418, 0.7744405, 0.3976747, 0.734416,
[0.13983674, 0.39817378, 0.7858866, 0.90109515, 0.5441741, 0.6534582, 0.9902577, 0.5279418, 0.7744405, 0.3976747, 0.734416,
[0.13983674, 0.39817378, 0.7858866, 0.90109515, 0.5441741, 0.6534582, 0.9902577, 0.5279418, 0.7744405, 0.3976747, 0.734416,

Can you let me know which version of the CPU runtime you have? I have verified that a later version of the CPU runtime (I am using an internal version) consistently outputs correct values.

Thanks,
Raghu

Quote:

Raghupathi Muthyalampalli (Intel) wrote:
Can you let me know which version of the CPU runtime you have? I have verified that a later version of the CPU runtime (I am using an internal version) consistently outputs correct values.

I used the most recent version at the time of my initial post. I'm glad to hear that Intel has solved the problem in their newest internal version. When will this internal version be ready for download?
