Compiler crash on byte swap function for ulong

Hi, everybody!

I found a rather strange bug in the OpenCL compiler for the Intel HD4000 GPU (and possibly for all Intel GPUs).

Consider this kernel code:

ulong reverse( ulong x )
{
    ulong y = x >> 0x38;
    y |= ( ( x >> 0x28 ) & 0x000000000000FF00 );
    y |= ( ( x >> 0x18 ) & 0x0000000000FF0000 );
    y |= ( ( x >> 0x08 ) & 0x00000000FF000000 );
    y |= ( ( x << 0x08 ) & 0x000000FF00000000 );
    y |= ( ( x << 0x18 ) & 0x0000FF0000000000 );
    y |= ( ( x << 0x28 ) & 0x00FF000000000000 );
    y |= x << 0x38;
    
    return y;
}

ulong reverseHigh( ulong x )
{
    ulong y = x >> 0x38;
    y |= ( ( x >> 0x28 ) & 0x000000000000FF00 );
    y |= ( ( x >> 0x18 ) & 0x0000000000FF0000 );
    y |= ( ( x >> 0x08 ) & 0x00000000FF000000 );
    
    return y;
}

ulong reverseLow( ulong x )
{
    ulong y = x << 0x38;
    y |= ( ( x << 0x08 ) & 0x000000FF00000000 );
    y |= ( ( x << 0x18 ) & 0x0000FF0000000000 );
    y |= ( ( x << 0x28 ) & 0x00FF000000000000 );
    
    return y;
}

ulong rev( ulong x )
{
    return reverseHigh( x ) | reverseLow( x );
}

__kernel void func( __global ulong* input, __global ulong* output )
{
    uint gid = get_global_id( 0 );
    
    // neither of these two calls compiles
    output[ gid ] = reverse( input[ gid ] );
    //output[ gid ] = rev( input[ gid ] );
    
    // this code works!
    //output[ gid ] = reverseHigh( input[ gid ] ) | reverseLow( input[ gid ] );
}

The compiler crashes with a very strange error:

fcl build 1 succeeded.
error: Cannot yet select: 0x5e82160: i64 = bswap 0x5f26550 [ORD=5] [ID=25]
  0x5f26550: i64,ch = GHAL3DISD::LOAD64 0x5f262a8, 0x5f26440, 0x5f264c8 [ID=24]
    0x5f262a8: i32,ch = load 0x5e81d20:1, 0x5f26220, 0x5e82050<LD4[%2](align=8)> [ID=21]
      0x5e81d20: i32,ch = llvm.GHAL3D.get.global.id 0x5e81770, 0x5e81fc8, 0x5e81c98 [ORD=1] [ID=11]
        0x5e81770: ch = EntryToken [ORD=1] [ID=0]
        0x5e81fc8: i32 = Constant<77> [ID=5]
        0x5e81c98: i32 = Constant<0> [ORD=1] [ID=3]
// a lot more lines...

This happens when reverse() or rev() is called in the kernel. It does not happen with the last line of the kernel, which is in fact just rev() inlined by hand.

The inline keyword doesn't help.
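
One possible reading of the log is that the compiler recognizes the shift-and-mask pattern as a single 64-bit byte swap (the `i64 = bswap` node) and then has no instruction to select for it. Under that assumption, a formulation that only ever swaps 32-bit halves might avoid the 64-bit node. The sketch below is plain C so it can be checked on the host. The names `reverse32`, `reverse_halves` and `self_check` are hypothetical, and it has not been tested against the Intel compiler, which may still fold the halves back into one 64-bit swap.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t reverse32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Reverse the eight bytes of a 64-bit value: reverse each 32-bit half,
   then exchange the halves. */
static uint64_t reverse_halves(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* bytes 0..3 */
    uint32_t hi = (uint32_t)(x >> 32);  /* bytes 4..7 */
    return ((uint64_t)reverse32(lo) << 32) | (uint64_t)reverse32(hi);
}

/* Host-side sanity check of the byte order. */
void self_check(void)
{
    assert(reverse_halves(0x0102030405060708ULL) == 0x0807060504030201ULL);
    assert(reverse_halves(0x00000000000000FFULL) == 0xFF00000000000000ULL);
}
```

In a kernel, the same two functions would use OpenCL C's `uint` and `ulong` in place of `uint32_t` and `uint64_t`, and the call would become `output[ gid ] = reverse_halves( input[ gid ] );`.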

I am testing this code on my Asus notebook, where I can't install the Intel SDK: the installer complains that the OpenCL driver is too old. I downloaded the latest beta driver from the official site and copied Intel_OpenCL_ICD64.dll (ver. 2.0.0.0), IntelOpenCL64.dll (ver. 10.18.10.3652), Intel_OpenCL_ICD32.dll (ver. 2.0.0.0) and IntelOpenCL32.dll (ver. 10.18.10.3652) to the System32 and SysWOW64 folders. The error remains.

Any help will be appreciated.

 


Thanks for this report. I've been able to reproduce at least some of the behavior you've described, and it certainly looks like it could be a bug. We'll investigate and get back to you with more details.
