Bug report: ippiLUTPalette_8u_C1R

Hi, I have noticed a bug in the ippiLUTPalette_8u_C1R function while testing images with odd dimensions. I am using IPP v2018.3.210 on 64-bit Windows.

I have (w, h) = (1575, 1049), srcStep = 1664, dstStep = 1576, and s_lut[256] is a uint8_t array defined at global scope.

Now calling

ippiLUTPalette_8u_C1R(srcData, srcStep, dstData, dstStep, IppiSize{w, h}, s_lut, 8);

results in wrong values everywhere except the first row of the image. It seems srcStep is larger than what this function can handle. On a side note, ippiCopy_8u_C1R works without any problem on the same data.

After I noticed that only the first row of the destination image is correct, I applied a simple workaround. Calling

for(int j = 0; j < h; j++)
    ippiLUTPalette_8u_C1R(srcData + j*srcStep, w, dstData + j*dstStep, w, IppiSize{w, 1}, s_lut, 8);

works without any problems and the resulting values are as expected. Here I process the image row by row, passing the width as the step. I think the step value is the problem.

On another note, I have checked the correctness of this function against a simple C implementation, shown below:

uint8_t *srcRow = srcData;
uint8_t *dstRow = dstData;

for(int j = 0; j < h; j++)
{
    for(int i = 0; i < w; i++)
    {
        dstRow[i] = s_lut[ srcRow[i] ];
    }

    srcRow += srcStep;
    dstRow += dstStep;
}
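For anyone who wants to repeat this check, here is a minimal sketch of a row-wise comparison between two strided 8-bit images, for example the IPP output and the output of the loop above. The helper name firstMismatchingRow is hypothetical and not part of IPP; it only assumes that both steps are given in bytes and are at least w.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an IPP call): returns the index of the first row
// in which two strided 8-bit images differ inside the w x h ROI, or -1 if
// every row matches. Padding bytes beyond column w-1 are ignored.
int firstMismatchingRow(const std::uint8_t* a, int aStep,
                        const std::uint8_t* b, int bStep,
                        int w, int h)
{
    assert(aStep >= w && bStep >= w);
    for (int j = 0; j < h; ++j) {
        const std::uint8_t* rowA = a + static_cast<std::size_t>(j) * aStep;
        const std::uint8_t* rowB = b + static_cast<std::size_t>(j) * bStep;
        for (int i = 0; i < w; ++i) {
            if (rowA[i] != rowB[i]) return j;  // first differing row
        }
    }
    return -1;  // all rows identical inside the ROI
}
```

With the behaviour described in this post (only the first row correct), such a check would be expected to return 1 for the full-image call and -1 for the row-by-row workaround.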

Runtimes on Intel i7-8700K:

ippiLUTPalette_8u_C1R -> 980 us

Simple C loop byte-by-byte lookup -> 1070 us

Is this function really not optimized? Or, unlikely as it seems, are there still some invalid operations even when it is applied row by row, making it this slow? 1 ms for a single pass over the image is really too much. ippiCopy_8u_C1R on the exact same data takes 88 us (>11x faster).
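The post does not say how these runtimes were measured. As a rough sketch only, timings of this kind could be collected with a small std::chrono harness like the one below. The helper name medianMicroseconds is hypothetical, and taking the median of several runs is my assumption, not necessarily the method used for the numbers above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: calls `work` exactly `runs` times and returns the
// median wall-clock duration of a single call in microseconds (for an even
// number of runs, the upper of the two middle samples).
template <typename Work>
double medianMicroseconds(Work work, int runs)
{
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(runs);
    for (int r = 0; r < runs; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

The callable would wrap either the IPP call or the C loop, for example a lambda that captures the buffers by reference. The median is used here because it is less sensitive to a slow first run (cold caches) than the mean.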

Hi,

Have you tried checking the status returned by the function?

Would you try this and see what it says?

 status = ippiLUTPalette_8u32u_C1R( pSrc, srcStep, pDst, dstStep, roiSize, pTable, nBitSize);

Please also refer to this example page for the usage:

https://software.intel.com/en-us/ipp-dev-reference-lutpalette-lutpalette...

Also, could you upload a reproducer so that we can check this locally?

Thank you

Hi

A quick analysis shows that the function should be optimized. We will investigate this more deeply and get back to you with an answer in about 2 weeks.

Pavel

Quote:

JON J K. (Intel) wrote:

Hi,

Have you tried checking the status returned by the function?

Would you try this and see what it says?

 status = ippiLUTPalette_8u32u_C1R( pSrc, srcStep, pDst, dstStep, roiSize, pTable, nBitSize);

Please also refer to this example page for the usage:

https://software.intel.com/en-us/ipp-dev-reference-lutpalette-lutpalette...

Also, could you upload a reproducer so that we can check this locally?

Thank you


Yes, that was the first thing I checked, and the function returns ippStsNoErr.

Hi Alparslan Y.

I confirm your issue. The function actually uses srcStep for both the source and the destination image. We will fix this in the next release. Also, the function has an optimized path for nBitSize <= 4. We will consider whether optimizations for other bit sizes can be added in future IPP releases.

Thanks for your feedback.
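To illustrate why this defect leaves only the first row correct, here is a small self-contained model in plain C++. It is a sketch of the behaviour described in this thread, not IPP's actual code: lutStrided and countCorrectRows are hypothetical names, and the image sizes are arbitrary small values chosen for the demonstration. When the source step is used for the destination as well, row 0 still lands at offset 0, but every later row is written at j * srcStep instead of j * dstStep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain table lookup with independent source and destination strides (bytes).
void lutStrided(const std::uint8_t* src, int srcStep,
                std::uint8_t* dst, int dstStep,
                int w, int h, const std::uint8_t* lut)
{
    assert(srcStep >= w && dstStep >= w);
    for (int j = 0; j < h; ++j) {
        const std::uint8_t* s = src + static_cast<std::size_t>(j) * srcStep;
        std::uint8_t* d = dst + static_cast<std::size_t>(j) * dstStep;
        for (int i = 0; i < w; ++i) d[i] = lut[s[i]];
    }
}

// Builds a small image, runs either the correct lookup or a model of the
// reported defect (srcStep used for both images), and returns how many
// destination rows, read back with the caller's dstStep, are correct.
int countCorrectRows(bool modelDefect)
{
    const int w = 7, h = 5, srcStep = 16, dstStep = 8;  // assumed demo sizes
    std::vector<std::uint8_t> lut(256);
    for (int v = 0; v < 256; ++v) lut[v] = static_cast<std::uint8_t>(255 - v);

    const std::size_t n = static_cast<std::size_t>(h) * srcStep;
    std::vector<std::uint8_t> src(n);
    for (int j = 0; j < h; ++j)
        for (int i = 0; i < w; ++i)
            src[j * srcStep + i] = static_cast<std::uint8_t>(j * 31 + i * 7 + 1);

    // Both buffers are oversized (h * srcStep bytes, zero-filled) so that the
    // defect model's writes stay inside the allocation.
    std::vector<std::uint8_t> expected(n), actual(n);
    lutStrided(src.data(), srcStep, expected.data(), dstStep, w, h, lut.data());
    lutStrided(src.data(), srcStep, actual.data(),
               modelDefect ? srcStep : dstStep, w, h, lut.data());

    int correct = 0;
    for (int j = 0; j < h; ++j) {
        bool same = true;
        for (int i = 0; i < w; ++i)
            same = same && expected[j * dstStep + i] == actual[j * dstStep + i];
        if (same) ++correct;
    }
    return correct;
}
```

In this model countCorrectRows(false) reports all 5 rows correct, while countCorrectRows(true) reports only row 0. Note that with the sizes from the original report (srcStep = 1664, dstStep = 1576), a destination buffer allocated as h * dstStep bytes would also be written past its end, which is why the model above deliberately oversizes its buffers.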

 

Thank you very much, looking forward to the next release.