ippsConvolve_32f takes about 5 times longer on AVX2 compared to AVX.

Hello All,

We measured the time ippsConvolve_32f takes on an i5-4402E processor and found that it takes about 5 times longer when using AVX2 than when using AVX.

We tried ippsConv_32f instead of ippsConvolve_32f and got the same results. We also tried the available convolution algorithms (ippAlgAuto, ippAlgDirect and ippAlgFFT) and found that ippAlgAuto and ippAlgDirect give the same timing as each other, on both AVX and AVX2.

With ippAlgFFT we see a small performance decrease on AVX, and on AVX2 a performance increase compared to ippAlgAuto on AVX2, but it still takes more time than ippAlgAuto on AVX.

The times we get, in microseconds:

Algorithm                   AVX    AVX2
ippAlgAuto, ippAlgDirect      4      27
ippAlgFFT                     5       5

So there seems to be a bug in ippsConvolve_32f for ippAlgFFT when using AVX2.

AVX2 should be faster than AVX for every algorithm, but we see no improvement for ippAlgFFT, and for ippAlgDirect the performance degrades severely.

Notes:

We are using static linkage (#include <ipp_h9.h> for avx2 and <ipp_g9.h> for avx before #include <ipp.h>).

Thank you,

Itzhak

Hi Itzhak,

What vector lengths are you testing? Could you provide us with the test code so we can run the test quickly?

Best regards,
Ying
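
For readers who want to reproduce this kind of measurement, here is a minimal timing sketch in C. It is not the test code from this thread, which was never posted. The names conv_direct_ref, conv_fn and time_conv_usec are hypothetical, and the plain-C direct convolution is only a stand-in for the library call under test. To measure IPP itself, a wrapper around ippsConvolve_32f with the same shape would be passed in instead, and the vector lengths would be whatever the application actually uses.

```c
#include <assert.h>
#include <time.h>

/* Plain-C direct (time-domain) convolution, used only as a stand-in for
   the library call being measured. dst must hold len1 + len2 - 1 floats. */
void conv_direct_ref(const float *src1, int len1,
                     const float *src2, int len2, float *dst)
{
    int i, j;
    for (i = 0; i < len1 + len2 - 1; ++i)
        dst[i] = 0.0f;
    for (i = 0; i < len1; ++i)
        for (j = 0; j < len2; ++j)
            dst[i + j] += src1[i] * src2[j];
}

/* Hypothetical common signature for any convolution routine under test. */
typedef void (*conv_fn)(const float *, int, const float *, int, float *);

/* Returns the average CPU time per call in microseconds, as reported by
   clock(). One untimed warm-up call is made first so that one-time setup
   costs are not counted. */
double time_conv_usec(conv_fn fn, const float *src1, int len1,
                      const float *src2, int len2, float *dst, int reps)
{
    clock_t t0, t1;
    int r;

    assert(reps > 0);
    fn(src1, len1, src2, len2, dst);   /* warm-up, not timed */

    t0 = clock();
    for (r = 0; r < reps; ++r)
        fn(src1, len1, src2, len2, dst);
    t1 = clock();

    return (double)(t1 - t0) * 1e6 / (double)CLOCKS_PER_SEC / (double)reps;
}
```

Because a single call takes only a few microseconds in the numbers reported above, reps should be large enough that the total run is well above the resolution of clock(). Running the same harness once per build (AVX and AVX2) and once per algorithm type gives figures that can be compared directly.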

Hi Itzhak,

Sorry for the delay. Yes, our developer has confirmed the bug: there is a performance degradation in the latest IPP releases. We will fix it in the next release. I will notify you when it is ready.

Thanks
Ying

Hi Ying,

Thank you for informing me.

Regards,

Itzhak

Hello Ying,

Has the fix for the bug been released?

We found the same bug in an old version of IPP, 7.0.1.104. The CPU is an i7-620LE. We are using static linkage (#include <ipp_p8.h> before #include <ipp.h>). The function we use there is ippsConv_32f(..), and we have seen that it takes about 5 times longer in some cases.

My question is whether there is a workaround for the problem with this function, or whether we must use the latest IPP. And will the latest IPP solve the problem?

Regards,

Itzhak

Hi Itzhak,

I'm checking with our developer to see if there is any workaround.

Yes, generally we fix problems in a new release. You mentioned that you found the same bug in an old version of IPP, 7.0.104, on the i7-620LE. Which code are you comparing the p8 code against?

Best Regards

Ying

 

Hi Ying,

It is difficult to answer your question, but I will try. We have two applications (old and new) with the same IPP code: the difference between them is in our own code, while the IPP code is the same. Each application has several working modes. In the old application all modes work as expected, meaning that processing takes a reasonable time. In the new application, one mode drives the CPU to 100% load. We investigated the code and found that the ippsConv_32f(..) function takes an unreasonable time to run. We succeeded in solving the problem by upgrading the IPP version we use from 7.0.104 to 8.0.0.083, without changing the code at all.

Regards,

Itzhak

Quote:

Ying H (Intel) wrote:

Hi Itzhak,

I'm checking with our developer to see if there is any workaround.

Yes, generally we fix problems in a new release. You mentioned that you found the same bug in an old version of IPP, 7.0.104, on the i7-620LE. Which code are you comparing the p8 code against?

Best Regards

Ying

 

Has the version that fixes the ippsConvolve_32f() function been released, or will it be released in the future?

Hi  Itzhak,

Sorry for the delay. I have checked, and there is no quick workaround. As the next IPP release (IPP 8.1 or update 2) is scheduled for early 2014, the fix will be released at that time.

Best Regards,

Ying

Hi Itzhak,

I'm glad to notify you that the fixed version should now be available. The latest one is IPP 8.2. You are welcome to try it. The install package can be downloaded from https://registrationcenter.intel.com/regcenter/register.aspx as before.

Best Regards,

Ying

Is it fixed as part of Intel Composer 2015?

Yes, IPP 8.2 is part of Composer 2015.
