Crash in read_imagef with Volume CL_R CL_UNSIGNED_INT8

Hi,

Today, taking a break from the HD4000, I decided to give the 2013 CPU SDK a try.

The latest version reports support for the format I need, so I switched to the CPU.

I am creating a volume with the 8-bit CL_R, CL_UNSIGNED_INT8 format.

Here is my code sampling the volume.

const sampler_t volume_trilinear_sampler = CLK_FILTER_LINEAR | CLK_NORMALIZED_COORDS_TRUE | CLK_ADDRESS_CLAMP_TO_EDGE;
inline float SampleRaw(__read_only image3d_t volume, float4 position)
{
    if ( position.x < 0 || position.y < 0 || position.z < 0 )
        return -1;
    if ( (int)position.x >= 1.0f || (int)position.y >= 1.0f || (int)position.z >= 1.0f )
        return -1;
    return read_imagef(volume,volume_trilinear_sampler,position).x;
}

Everything builds fine, but the kernel crashes when I run it.

There are no errors when creating the context, textures, or kernel. It just crashes.

The generated assembly seems to indicate that the read_imagef function pointer is NULL: the final call edi is made with EDI = 0.

EAX = 00000000 EBX = 21450600 ECX = 00000000 EDX = 00000000 ESI = 00000000 EDI = 00000000 EIP = 70040710 ESP = 24F542C0 EBP = 24F544FC EFL = 00000000

70040641 C5 E2 59 5E 58       vmulss      xmm3,xmm3,dword ptr [esi+58h]  
70040646 C5 E2 58 D2          vaddss      xmm2,xmm3,xmm2  
7004064A C5 E0 57 DB          vxorps      xmm3,xmm3,xmm3  
7004064E C5 F8 2E D9          vucomiss    xmm3,xmm1  //Test if  x < 0.0 
70040652 0F 87 81 01 00 00    ja          700407D9  
70040658 C5 F8 2E D8          vucomiss    xmm3,xmm0    //Test if  y < 0.0 
7004065C 0F 87 77 01 00 00    ja          700407D9  
70040662 C5 F8 2E DA          vucomiss    xmm3,xmm2    //Test if  z < 0.0 
70040666 0F 87 6D 01 00 00    ja          700407D9  
7004066C C5 FA 2C C1          vcvttss2si  eax,xmm1  
70040670 C5 FA 2A D8          vcvtsi2ss   xmm3,xmm0,eax  
70040674 C5 F8 2E 1D 08 00 05 70 vucomiss    xmm3,dword ptr ds:[70050008h]  //Test if (int)x >= 1.0
7004067C 0F 83 57 01 00 00    jae         700407D9  
70040682 C5 FA 2C C0          vcvttss2si  eax,xmm0  
70040686 C5 FA 2A D8          vcvtsi2ss   xmm3,xmm0,eax  
7004068A C5 F8 2E 1D 08 00 05 70 vucomiss    xmm3,dword ptr ds:[70050008h]  //Test if (int)y >= 1.0
70040692 0F 83 41 01 00 00    jae         700407D9  
70040698 C5 FA 2C C2          vcvttss2si  eax,xmm2  
7004069C C5 FA 2A D8          vcvtsi2ss   xmm3,xmm0,eax  
700406A0 C5 F8 2E 1D 08 00 05 70 vucomiss    xmm3,dword ptr ds:[70050008h]  //Test if (int)z >= 1.0
700406A8 0F 83 2B 01 00 00    jae         700407D9  
700406AE C4 E3 71 21 C0 10    vinsertps   xmm0,xmm1,xmm0,10h  
700406B4 C4 E3 79 21 C2 20    vinsertps   xmm0,xmm0,xmm2,20h  
700406BA 8B 9C 24 F4 00 00 00 mov         ebx,dword ptr [esp+0F4h]  
700406C1 8B 73 14             mov         esi,dword ptr [ebx+14h]  
700406C4 8B BB 04 01 00 00    mov         edi,dword ptr [ebx+104h]  
700406CA 8D 84 24 E0 01 00 00 lea         eax,[esp+1E0h]  
700406D1 89 44 24 08          mov         dword ptr [esp+8],eax  
700406D5 8D 84 24 F0 01 00 00 lea         eax,[esp+1F0h]  
700406DC 89 44 24 04          mov         dword ptr [esp+4],eax  
700406E0 89 1C 24             mov         dword ptr [esp],ebx  
700406E3 C5 F8 54 05 60 00 05 70 vandps      xmm0,xmm0,xmmword ptr ds:[70050060h]  
700406EB FF 93 84 00 00 00    call        dword ptr [ebx+84h]  
700406F1 C5 F8 28 D0          vmovaps     xmm2,xmm0  
700406F5 C5 F8 28 8C 24 E0 01 00 00 vmovaps     xmm1,xmmword ptr [esp+1E0h]  
700406FE C5 F8 28 84 24 F0 01 00 00 vmovaps     xmm0,xmmword ptr [esp+1F0h]  
70040707 89 74 24 04          mov         dword ptr [esp+4],esi  
7004070B 89 1C 24             mov         dword ptr [esp],ebx  
7004070E FF D7                call        edi 

Cheers.

Laurent

Hi Laurent,

Can you also provide your host code - enough for us to reproduce the issue?

Thanks,

Raghu

Hi Raghu,

I think this is going to be tricky. I might be able to provide just the kernel, though, as long as I can send it in a PM.

Laurent.

Actually, according to the OpenCL documentation, using read_imagef with CL_UNSIGNED_INT8 results in undefined behavior (great).

I guess in that case it was a crash :)

Moving the texture to UNORM8 fixes the crash.
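
As a rough illustration of why the UNORM8 format fits read_imagef: with CL_UNORM_INT8, each stored byte is converted to a float in [0, 1] by dividing by 255, and a linear filter then blends neighbouring texels. The plain C sketch below mimics this on the host along a single axis with normalized coordinates and clamp-to-edge addressing. The names unorm8_to_float and sample_linear_1d are hypothetical helpers, not OpenCL API, and a real 3D sampler blends eight texels rather than two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the conversion applied to a CL_UNORM_INT8 texel. */
static float unorm8_to_float(uint8_t c)
{
    return (float)c / 255.0f;
}

/* Hypothetical helper: 1D linear filtering with a normalized coordinate s
 * and clamp-to-edge addressing. Texel centers sit at (i + 0.5) / n. */
static float sample_linear_1d(const uint8_t *texels, size_t n, float s)
{
    assert(texels != NULL && n > 0);

    float u = s * (float)n - 0.5f;      /* unnormalized, shifted to texel centers */

    long i0 = (long)u;                  /* truncate toward zero ... */
    if ((float)i0 > u)
        i0--;                           /* ... then correct to floor for negatives */
    float frac = u - (float)i0;         /* blend weight in [0, 1) */
    long i1 = i0 + 1;

    long last = (long)n - 1;            /* clamp both indices to the valid range */
    if (i0 < 0) i0 = 0;
    if (i0 > last) i0 = last;
    if (i1 < 0) i1 = 0;
    if (i1 > last) i1 = last;

    float a = unorm8_to_float(texels[i0]);
    float b = unorm8_to_float(texels[i1]);
    return (1.0f - frac) * a + frac * b;
}
```

With the two texels {0, 255}, sampling halfway between the texel centers (s = 0.5) blends them equally. An integer format has no equivalent normalized float conversion, which is why it is paired with read_imageui instead.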

It is also the only way to get linear filtering, since CL_UNSIGNED_INT8 requires read_imageui, which does not support linear filtering as far as I can see in the OpenCL documentation.
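
The pairing rules described above could be checked on the host before launching a kernel. The sketch below is only an illustration under assumptions: the enums are local, hypothetical stand-ins for the real CL_* constants, read_is_defined is not part of OpenCL, and only a subset of channel types is covered (packed formats are left out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical local stand-ins for a subset of the OpenCL channel data types. */
typedef enum {
    CH_SNORM_INT8, CH_SNORM_INT16,
    CH_UNORM_INT8, CH_UNORM_INT16,
    CH_SIGNED_INT8, CH_SIGNED_INT16, CH_SIGNED_INT32,
    CH_UNSIGNED_INT8, CH_UNSIGNED_INT16, CH_UNSIGNED_INT32,
    CH_HALF_FLOAT, CH_FLOAT
} channel_type;

typedef enum { READ_IMAGEF, READ_IMAGEI, READ_IMAGEUI } read_fn;
typedef enum { FILTER_NEAREST, FILTER_LINEAR } filter_mode;

/* Which read_image variant has defined results for a given channel type. */
static read_fn defined_read_fn(channel_type t)
{
    switch (t) {
    case CH_SIGNED_INT8:
    case CH_SIGNED_INT16:
    case CH_SIGNED_INT32:
        return READ_IMAGEI;
    case CH_UNSIGNED_INT8:
    case CH_UNSIGNED_INT16:
    case CH_UNSIGNED_INT32:
        return READ_IMAGEUI;
    case CH_SNORM_INT8:
    case CH_SNORM_INT16:
    case CH_UNORM_INT8:
    case CH_UNORM_INT16:
    case CH_HALF_FLOAT:
    case CH_FLOAT:
        return READ_IMAGEF;
    }
    assert(!"unknown channel type");
    return READ_IMAGEF;
}

/* True when the combination is defined: the read function must match the
 * channel type, and the integer variants require nearest filtering. */
static bool read_is_defined(channel_type t, read_fn fn, filter_mode f)
{
    if (defined_read_fn(t) != fn)
        return false;
    if (fn != READ_IMAGEF && f == FILTER_LINEAR)
        return false;
    return true;
}
```

Under this sketch, CH_UNSIGNED_INT8 with READ_IMAGEF is rejected (the combination in the original kernel), while CH_UNORM_INT8 with READ_IMAGEF and FILTER_LINEAR is accepted.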

I really feel that the read_image function definitions have been badly messed up by the OpenCL committee. There is a ton of undefined behavior and unsupported modes, which is not great at all for something that wants to be a standard.

Thank you Laurent. We will evaluate the value of this as a future feature request moving forward.

- Chuck
