Bug in OpenCL driver on Intel HD Graphics?

Hello. 

I found a curious bug in the Intel OpenCL driver on Intel HD Graphics 4600 (Core i5 4570).

Driver version: 10.18.10.3412

Intel OpenCL SDK Version: 2013 R3 64-bit

I wrote the following kernel:

    int gidx = get_global_id(0);
    int gidy = get_global_id(1);
    
    __local uchar temp[3];
    uchar r = 255;
    if (gidx == 0 && gidy == 0)
    {
        temp[0] = 255;
        temp[1] = 34;
        temp[2] = 255;
        for (int i=0; i<3; i++)
        {
            printf("temp[i] = %d\n", temp[i]);
            printf("r = %d\n", r);
            r = min(r,temp[i]);
        }
    }

When I run it I get:

temp[i] = 255
r = 255
temp[i] = 34
r = 15
temp[i] = 255
r = 15

The variable 'r' has an incorrect value: the second line should still print r = 255 and the third should print r = 34, but both print 15. What might cause this?

 


I tried to reproduce this on a newer driver as well as on the driver you mentioned, without success. I am using the 2014 SDK, but I don't believe that makes a difference.

Is the code you posted the whole kernel or part of a bigger kernel? If it is the latter, I suspect the bug is elsewhere in your code.

Raghu

This code is the whole kernel. It is a simplified case from the erode function in the OpenCV library, and it works correctly on other platforms (Nvidia, AMD, Intel CPU).

If I define:

#define OP_MIN(A, B) ((A) < (B) ? (A) : (B))

and replace min() with OP_MIN, this example gives the correct result (including on the Intel GPU), but runs a little slower.
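As a rough illustration of the ternary workaround, here is a minimal sketch in plain host-side C rather than OpenCL C, so it can be compiled without a GPU. The `uchar` typedef stands in for the OpenCL built-in type, and `running_min` is a hypothetical helper that mirrors the loop in the kernel; neither name comes from the OpenCV source. It shows only what a correct min should produce for these inputs and does not touch the driver.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char uchar;  /* stand-in for OpenCL C's built-in uchar */

/* Fully parenthesized so the macro expands safely inside larger expressions. */
#define OP_MIN(A, B) ((A) < (B) ? (A) : (B))

/* Hypothetical helper: running minimum over values[], mirroring the kernel loop. */
uchar running_min(const uchar *values, size_t count, uchar start)
{
    assert(values != NULL);
    uchar r = start;
    for (size_t i = 0; i < count; i++) {
        r = OP_MIN(r, values[i]);
    }
    return r;
}
```

With the values from the kernel, `running_min((const uchar[]){255, 34, 255}, 3, 255)` should return 34.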

I will try to install new version of Intel OpenCL SDK to check it.

Another strange detail: this error occurs only with the min() function.

A similar example with max() works correctly:

    int gidx = get_global_id(0);
    int gidy = get_global_id(1);

    __local uchar temp[3];
    uchar r = 0;
    if (gidx == 0 && gidy == 0)
    {
        temp[0] = 0;
        temp[1] = 34;
        temp[2] = 0;
        for (int i=0; i<3; i++)
        {
            printf("temp[i] = %d\n", temp[i]);
            printf("r = %d\n", r);
            r = max(r,temp[i]);
        }
    }

Hi Raghu.

I checked this code with Intel OpenCL SDK version 2014, and the error still occurs for me.
What do you suggest to solve this issue?

Thanks,

Alexander.

 

Hi Raghu.

I am still waiting for your response. In case it helps, here is a link to the original code of the 'erode' function from the OpenCV library: https://github.com/Itseez/opencv/blob/master/modules/imgproc/src/opencl/morph.cl

Thanks, 

Alexander.

Hi Alexander,

Did you update the driver or are you still using 10.18.10.3412?

Thanks,
Raghu

Hi Alexander,

I retested this on another driver and was able to reproduce the behavior you are seeing. I will open a bug and let you know when the fix is in.

Thanks,
Raghu

After further debugging, it looks like r gets truncated to 4 bits. This issue is fixed (not reproducible) in the internal drivers. I don't have any information on when the next driver will be posted, but I'll keep you updated.

Raghu
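The 4-bit truncation is consistent with the printed output, since 255 & 0xF = 15. The sketch below, in plain host-side C, replays the kernel loop under one assumed fault model: the result of min() keeps only its low 4 bits. This is a guess that happens to match the log, not the driver's actual code path, and `truncate_bits`, `faulty_min` and `replay_loop` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char uchar;  /* stand-in for OpenCL C's built-in uchar */

/* Keep only the low `bits` bits of v (bits must be in 1..8). */
uchar truncate_bits(uchar v, unsigned bits)
{
    assert(bits >= 1 && bits <= 8);
    return (uchar)(v & ((1u << bits) - 1u));
}

/* Assumed fault model: a correct min whose result is cut to 4 bits. */
uchar faulty_min(uchar a, uchar b)
{
    uchar m = a < b ? a : b;
    return truncate_bits(m, 4);
}

/* Replays the kernel loop; seen[i] receives the value of r that the
   kernel would print at the start of iteration i. */
void replay_loop(const uchar *temp, size_t count, uchar r, uchar *seen)
{
    for (size_t i = 0; i < count; i++) {
        seen[i] = r;
        r = faulty_min(r, temp[i]);
    }
}
```

Replaying temp = {255, 34, 255} with r starting at 255 gives 255, 15, 15 for the printed values of r, which matches the log in the first post: min(255, 255) is 255, truncated to 15, and 15 then stays below both 34 and 255.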

Hi Raghu,

I updated to the latest beta driver (10.18.10.3496).

Thanks for your help!

Alexander.
