Compiler crash on byte swap function for ulong

Hi, everybody!

I found a rather strange bug in the OpenCL compiler for the Intel HD4000 GPU (and possibly for all Intel GPUs).

Consider this kernel code:

ulong reverse( ulong x )
{
    ulong y = x >> 0x38;
    y |= ( ( x >> 0x28 ) & 0x000000000000FF00 );
    y |= ( ( x >> 0x18 ) & 0x0000000000FF0000 );
    y |= ( ( x >> 0x08 ) & 0x00000000FF000000 );
    y |= ( ( x << 0x08 ) & 0x000000FF00000000 );
    y |= ( ( x << 0x18 ) & 0x0000FF0000000000 );
    y |= ( ( x << 0x28 ) & 0x00FF000000000000 );
    y |= x << 0x38;
    
    return y;
}

ulong reverseHigh( ulong x )
{
    ulong y = x >> 0x38;
    y |= ( ( x >> 0x28 ) & 0x000000000000FF00 );
    y |= ( ( x >> 0x18 ) & 0x0000000000FF0000 );
    y |= ( ( x >> 0x08 ) & 0x00000000FF000000 );
    
    return y;
}

ulong reverseLow( ulong x )
{
    ulong y = x << 0x38;
    y |= ( ( x << 0x08 ) & 0x000000FF00000000 );
    y |= ( ( x << 0x18 ) & 0x0000FF0000000000 );
    y |= ( ( x << 0x28 ) & 0x00FF000000000000 );
    
    return y;
}

ulong rev( ulong x )
{
    return reverseHigh( x ) | reverseLow( x );
}

__kernel void func( __global ulong* input, __global ulong* output )
{
    uint gid = get_global_id( 0 );
    
    // neither of these two calls compiles
    output[ gid ] = reverse( input[ gid ] );
    //output[ gid ] = rev( input[ gid ] );
    
    // this code works!
    //output[ gid ] = reverseHigh( input[ gid ] ) | reverseLow( input[ gid ] );
}

The compiler crashes with a very strange error:

fcl build 1 succeeded.
error: Cannot yet select: 0x5e82160: i64 = bswap 0x5f26550 [ORD=5] [ID=25]
  0x5f26550: i64,ch = GHAL3DISD::LOAD64 0x5f262a8, 0x5f26440, 0x5f264c8 [ID=24]
    0x5f262a8: i32,ch = load 0x5e81d20:1, 0x5f26220, 0x5e82050<LD4[%2](align=8)> [ID=21]
      0x5e81d20: i32,ch = llvm.GHAL3D.get.global.id 0x5e81770, 0x5e81fc8, 0x5e81c98 [ORD=1] [ID=11]
        0x5e81770: ch = EntryToken [ORD=1] [ID=0]
        0x5e81fc8: i32 = Constant<77> [ID=5]
        0x5e81c98: i32 = Constant<0> [ORD=1] [ID=3]
// a lot more lines...

This happens when reverse() or rev() is called in the kernel. It doesn't happen with the last line of code, which is in fact just rev() inlined by hand.
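The `i64 = bswap` node in the error suggests the compiler has recognized reverse() as a plain 64-bit byte swap, though that is my reading of the log and not something I have confirmed. As a sanity check that the function really is one, here is a host-side sketch in plain C. It assumes uint64_t as a stand-in for OpenCL's ulong, and the names reverse_ref and bswap64_loop are made up for this example; it can also be used to check the kernel's output once it builds.

```c
#include <stdint.h>

/* Host-side copy of the kernel's reverse().
   Assumption: uint64_t stands in for OpenCL's 64-bit ulong. */
static uint64_t reverse_ref(uint64_t x)
{
    uint64_t y = x >> 0x38;
    y |= (x >> 0x28) & 0x000000000000FF00ULL;
    y |= (x >> 0x18) & 0x0000000000FF0000ULL;
    y |= (x >> 0x08) & 0x00000000FF000000ULL;
    y |= (x << 0x08) & 0x000000FF00000000ULL;
    y |= (x << 0x18) & 0x0000FF0000000000ULL;
    y |= (x << 0x28) & 0x00FF000000000000ULL;
    y |= x << 0x38;
    return y;
}

/* Straightforward byte swap, one byte per iteration, for comparison. */
static uint64_t bswap64_loop(uint64_t x)
{
    uint64_t y = 0;
    for (int i = 0; i < 8; ++i) {
        y = (y << 8) | (x & 0xFF); /* lowest byte of x becomes the next byte of y */
        x >>= 8;
    }
    return y;
}
```

Both functions return the same value for any input, and applying either one twice gives back the original value.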

The inline keyword doesn't help.

I'm testing this code on my Asus notebook, where I can't install the Intel SDK: the installer complains that the OpenCL driver is too old. I downloaded the latest beta driver from the official site and copied Intel_OpenCL_ICD64.dll (ver. 2.0.0.0), IntelOpenCL64.dll (ver. 10.18.10.3652), Intel_OpenCL_ICD32.dll (ver. 2.0.0.0) and IntelOpenCL32.dll (ver. 10.18.10.3652) to the System32 and SysWOW64 folders. The error remains.

Any help will be appreciated.

 

Thanks for this report.  I've been able to reproduce at least some of the behavior you've described, and it certainly looks like it could be a bug.  We'll investigate and get back to you with more details.
