Incorrect values returned for IPP SAD computation using ippiSAD8x8_16u32s_C1R


Posted by Greg

Incorrect values are often returned when using the IPP function ippiSAD8x8_16u32s_C1R() to compute an 8x8 SAD for 16-bit video. Video that is 15-bit or less appears to work correctly. The maximum possible 8x8 SAD value for 15-bit video is (2^15 - 1) * 8^2 = 2,097,088. Incorrect values are returned once the SAD exceeds that maximum. A list of example values is attached, and the source code used to generate them is listed below.

In the source code that follows, edit the values of mainVal and addVal to test different SAD sizes. The final SAD value should equal (mainVal * 8^2) + addVal.

The equivalent function for 4x4 SAD appears to have the same problem. Also, I am using IPP version 7.0.

// Assumes the IPP headers, <stdio.h>, <stdlib.h>, a test framework that
// provides ASSERT_TRUE, and an I32 typedef for a 32-bit signed integer.
void ippiSAD_test()
{
    Ipp16u cur[64], ref[64];
    Ipp16u *pCur = cur;
    Ipp16u *pRef = ref;
    I32 curStep = 8;    // row stride in pixels
    I32 refStep = 8;
    Ipp32s isad = 0;    // SAD returned by IPP
    I32 csad = 0;       // SAD computed by the reference loop below

    // SAD = (mainVal * 8 * 8) + addVal
    Ipp16u mainVal = 65535;
    Ipp16u addVal = 0;

    // set image pixel values
    for (I32 i = 0; i < 8; ++i) {
        for (I32 j = 0; j < 8; ++j) {
            pCur[i*8+j] = mainVal;
            pRef[i*8+j] = 0;
        }
    }
    pCur[0] += addVal;

    // the steps are passed in bytes, hence the factor of 2 for 16-bit pixels
    IppStatus stat;
    stat = ippiSAD8x8_16u32s_C1R(pCur, curStep*2,
                                 pRef, refStep*2,
                                 &isad, IPPVC_MC_APX_FF);
    ASSERT_TRUE(ippStsNoErr == stat);

    // reference SAD computed in plain C for comparison
    for (I32 j = 0; j < 8; ++j) {
        Ipp16u *p1 = &pCur[j * curStep];
        Ipp16u *p2 = &pRef[j * refStep];
        for (I32 k = 0; k < 8; ++k) {
            csad += abs(p1[k] - p2[k]);
        }
    }

    printf("IPP SAD:      %d\n", isad);
    printf("COMPUTED SAD: %d\n", csad);

    ASSERT_TRUE(isad == csad);
}
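
The bounds mentioned in the post can be checked with a few lines of standalone C++. This is only a sketch of the arithmetic; the helper names maxSad8x8 and expectedSad are made up for illustration and are not part of IPP.

```cpp
#include <cassert>
#include <cstdint>

// Largest possible 8x8 SAD for unsigned samples of the given bit depth:
// all 64 absolute differences equal (2^bitDepth - 1).
// Hypothetical helper, not an IPP function.
inline std::uint32_t maxSad8x8(unsigned bitDepth) {
    assert(bitDepth >= 1 && bitDepth <= 16);
    const std::uint32_t maxSample = (1u << bitDepth) - 1u;
    return maxSample * 8u * 8u;
}

// Expected SAD for the test in the post: 64 pixels of mainVal against a
// zero reference, with addVal added to one pixel.
// Assumes mainVal + addVal still fits in 16 bits.
inline std::uint32_t expectedSad(std::uint16_t mainVal, std::uint16_t addVal) {
    return static_cast<std::uint32_t>(mainVal) * 8u * 8u + addVal;
}
```

For example, maxSad8x8(15) gives 2097088 and maxSad8x8(16) gives 4194240, so any test with expectedSad(mainVal, addVal) above 2097088 falls in the range where the post reports wrong results.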

 

Attachments: table.png (5.81 KB), ippisad-test.cpp (1.04 KB)
Posted by Gennady Fedorov (Intel)

Gregory, thanks for the report. I see the same results with version 7.1.1. We will look into the cause of this problem.

Posted by Gennady Fedorov (Intel)

Gregory, I don't see the problem with the latest 8.0 version: 

Ipp16u mainVal = 50000;

ippIP PX (px) 8.0.0 (r40040) 8.0.0.40040
IPP SAD: 3200000
COMPUTED SAD: 3200000
Press any key to continue . . .

Ipp16u mainVal = 65535;

ippIP PX (px) 8.0.0 (r40040) 8.0.0.40040
IPP SAD: 4194240
COMPUTED SAD: 4194240
Press any key to continue . . .

Posted by Gennady Fedorov (Intel)

Gregory, actually the code you provided works correctly only when the static library is initialized for SSE code only:

ippStaticInit(); IppCpuType cputype = ippCpuSSE; // PASSED

In all other cases the problem persists, including with the latest 8.0 version. The problem has been escalated. We will inform you as soon as it is fixed.

thanks

Posted by Greg

Gennady, thanks for the quick response. I don't think the static solution will work for my application. I have come up with a workaround for now, but will revert to the IPP call once the problem is fixed.

Thanks for your help. 
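
The thread does not describe the workaround Greg used. One possible stopgap, shown here purely as an assumption, is a plain C++ 8x8 SAD that takes its steps in bytes the same way the test code passes them to IPP. The function name sad8x8_16u_ref is hypothetical and not an IPP symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain C++ 8x8 SAD for 16-bit unsigned samples (hypothetical fallback,
// not the workaround from the thread). Steps are row strides in bytes and
// are assumed to be multiples of sizeof(uint16_t).
inline std::int32_t sad8x8_16u_ref(const std::uint16_t* cur, std::size_t curStepBytes,
                                   const std::uint16_t* ref, std::size_t refStepBytes) {
    assert(curStepBytes % sizeof(std::uint16_t) == 0);
    assert(refStepBytes % sizeof(std::uint16_t) == 0);

    // Convert byte strides to strides in samples.
    const std::size_t curStride = curStepBytes / sizeof(std::uint16_t);
    const std::size_t refStride = refStepBytes / sizeof(std::uint16_t);

    std::int32_t sad = 0;  // at most 65535 * 64 = 4194240, fits in 32 bits
    for (int y = 0; y < 8; ++y) {
        const std::uint16_t* c = cur + y * curStride;
        const std::uint16_t* r = ref + y * refStride;
        for (int x = 0; x < 8; ++x) {
            // Widen to int before subtracting so the difference can go negative.
            const int d = static_cast<int>(c[x]) - static_cast<int>(r[x]);
            sad += (d < 0) ? -d : d;
        }
    }
    return sad;
}
```

A scalar loop like this will not match the speed of the optimized IPP code paths, so it would only make sense for blocks where a full 16-bit range is actually possible.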

Posted by Sergey Khlystov (Intel)

Hi Greg,

Could you please explain, in general terms, why the static solution won't fit your needs? We would like to know what needs to be improved in the static libs.

Regards,
Sergey 

Posted by Greg

Sergey,

The solution Gennady provided will not work because the most optimized instruction sets are required to meet specific performance criteria.

Thanks,
Greg

Posted by Greg

Just as a general note, this bug is no longer an issue for me. The problem only occurs when processing full 16-bit images. Currently, the application I am using it for only requires support for 10-bit video. I'd imagine processing full 16-bit images is a rare case, which is probably why this bug has gone unnoticed.

Greg
