Incorrect Intel platform results as compared to nVidia/AMD

Hello,

I have three OpenCL platforms installed on my Ubuntu 10.04 amd64 machine: nVidia, AMD and Intel. I noticed that my FFT kernel gives incorrect results on the Intel platform. I managed to reduce it to the following minimal kernel:

__kernel void test(__global float *out)
{
	float a[8];
	__local float smem[32];

	for(int q=0; q < 32; q++) { smem[q] = q; }
	barrier(CLK_LOCAL_MEM_FENCE);

	for(int q=0; q < 8; q++) { a[q] = 0; }

	int thread_id = get_local_id(0);

	if(thread_id < 16)
		a[0] = 0;
	a[0] = smem[thread_id];
	a[1] = 0;
	barrier(CLK_LOCAL_MEM_FENCE);

	smem[thread_id] = a[0];
	smem[thread_id + 16] = a[1];
	barrier(CLK_LOCAL_MEM_FENCE);

	if(thread_id < 16)
		out[thread_id] = a[0];

	a[0] = 0;
	a[1] = 0;
	a[2] = 0;
	a[3] = 0;
	a[4] = 0;
	a[5] = 0;
	a[6] = 0;

	if(thread_id < 0)
		out[0] = a[1];
}

When executed with a global size of 16 and a local size of 16, it returns the array [0 .. 15] on AMD (CPU) and nVidia (GPU), but an array of sixteen 15s on Intel (CPU). C and Python programs for reproduction are attached (the C code is a bit messy and has no error checking). The output I am getting:

Platform: NVIDIA CUDA
Device: Tesla C2050 / C2070
0.000000 1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 9.000000 10.000000 11.000000 12.000000 13.000000 14.000000 15.000000
Platform: Intel OpenCL
Device: Intel Xeon CPU E5620 @ 2.40GHz
15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000 15.000000
Platform: AMD Accelerated Parallel Processing
Device: Intel Xeon CPU E5620 @ 2.40GHz
0.000000 1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 9.000000 10.000000 11.000000 12.000000 13.000000 14.000000 15.000000
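
For reference, here is a rough pure-Python sketch of what I would expect the kernel to compute for a single work-group of 16 work-items. It is not one of the attached programs and does not use OpenCL at all. It assumes that each work-item has its own private copy of `a`, that `smem` is shared by the work-group, and that each barrier cleanly separates the phases. The function name `emulate_test_kernel` and the constant `LOCAL_SIZE` are made up for this illustration.

```python
LOCAL_SIZE = 16  # assumed work-group size (global size = local size = 16)


def emulate_test_kernel(local_size=LOCAL_SIZE):
    """Emulate the `test` kernel for one work-group, one barrier phase at a time."""
    smem = [0.0] * 32                              # __local float smem[32]
    out = [None] * local_size                      # __global float *out
    a = [[0.0] * 8 for _ in range(local_size)]     # private float a[8] per work-item

    # Phase 1: every work-item fills smem with 0..31 (all write the same values).
    for tid in range(local_size):
        for q in range(32):
            smem[q] = float(q)
    # barrier(CLK_LOCAL_MEM_FENCE)

    # Phase 2: each work-item clears its private array and loads one element.
    for tid in range(local_size):
        for q in range(8):
            a[tid][q] = 0.0
        if tid < 16:
            a[tid][0] = 0.0
        a[tid][0] = smem[tid]
        a[tid][1] = 0.0
    # barrier(CLK_LOCAL_MEM_FENCE)

    # Phase 3: each work-item writes its private values back to local memory.
    for tid in range(local_size):
        smem[tid] = a[tid][0]
        smem[tid + 16] = a[tid][1]
    # barrier(CLK_LOCAL_MEM_FENCE)

    # Phase 4: write the result, then the trailing stores from the kernel.
    for tid in range(local_size):
        if tid < 16:
            out[tid] = a[tid][0]
        for q in range(7):                         # a[0] .. a[6] = 0
            a[tid][q] = 0.0
        if tid < 0:                                # never true, kept from the kernel
            out[0] = a[tid][1]

    return out


if __name__ == "__main__":
    print(" ".join(f"{v:f}" for v in emulate_test_kernel()))
```

Under those assumptions the emulation yields 0 through 15, which matches what the nVidia and AMD platforms return.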

Does anyone know what could be causing this behavior?

Attachments:
test.c (text/x-csrc, 4.75 KB)
test.cl (application/octet-stream, 582 bytes)
test.py (text/x-python, 1.08 KB)

Hi,
The Intel SDK currently doesn't support Ubuntu 10.04.
However, we will look into your kernel.
Thanks, Shiri
