Pardiso performance 10.1.1 vs 11.0.2

We recently upgraded MKL from version 10.1.1 to version 11.0.2. We are finding that PARDISO is typically around 10% slower in 11.0.2 than it was in 10.1.1. The matrix for this problem is not large (~2000 rows). Is this to be expected, or should I send an example?

Hello,

If performance is better in MKL version 10.1.1, then we should investigate further. Please send me an example test case, and I'll escalate the issue to the engineering team once I have reproduced it.

Thank you,

Sridevi

Sridevi Allam Technical consulting engineer - Intel MKL

I have attached a Visual Studio project that I have been using to test this. It reads in a matrix and a right-hand-side vector, then calls PARDISO a number of times to get an accurate timing result. This is repeated 10 times to check that the result is reproducible. Apart from the DLLs, the only thing that changes between versions is the name of the function used to get the version string.

This test reproduces the slowdown we have been seeing in our actual software.
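
The timing approach described above (many solver calls per measurement, with the measurement repeated several times) might look roughly like the following. This is only a sketch and is not taken from the attached project: `SolveTimer`, `Measure`, and the `solve` delegate are hypothetical names, and the untimed warm-up call is an added assumption.

```csharp
using System;
using System.Diagnostics;

public static class SolveTimer
{
    // Times `callsPerRun` invocations of `solve` and repeats that measurement
    // `runs` times. Returns the elapsed milliseconds of each run so that the
    // runs can be compared for reproducibility.
    public static double[] Measure(Action solve, int callsPerRun, int runs)
    {
        if (solve == null) throw new ArgumentNullException(nameof(solve));
        if (callsPerRun <= 0) throw new ArgumentOutOfRangeException(nameof(callsPerRun));
        if (runs <= 0) throw new ArgumentOutOfRangeException(nameof(runs));

        // Assumption: one untimed warm-up call, so that JIT compilation and
        // DLL loading do not distort the first measured run.
        solve();

        var elapsedMs = new double[runs];
        for (int run = 0; run < runs; run++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            for (int call = 0; call < callsPerRun; call++)
            {
                solve();
            }
            watch.Stop();
            elapsedMs[run] = watch.Elapsed.TotalMilliseconds;
        }
        return elapsedMs;
    }
}
```

A call such as `SolveTimer.Measure(() => RunSolver(), 100, 10)` would then give ten timings of 100 calls each, where `RunSolver` is a placeholder for whatever wraps the PARDISO call. Comparing the ten values within one MKL version gives a feel for the noise before comparing across versions.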

Regards,

Euan

Attachments:

mklpardisotest.zip (80.02 KB)

Also here is the typical output for the two versions running on my machine.

Thanks for the test project and screenshots. I see that the performance decrease is around 10%, and it does need to be investigated.

Did you try to execute tests with 64-bit versions of MKL?

I've just tried it, and there is still a decrease with the 64-bit version, although it appears to be smaller (~5%). I've attached screenshots of the results.

Thanks for investigating this.

I did reproduce the problem, and I'm submitting a ticket for it to the engineering team. I'll update you on the status of the issue and any comments.

Thanks,

Sridevi

Sridevi Allam Technical consulting engineer - Intel MKL

Hi Euan,

A quick look at your code shows that you call PARDISO with phase 13 in a loop without releasing any memory. This is an incorrect use of PARDISO, and on a small matrix it could lead to the behavior you mentioned. Could you check the performance results with PARDISO used correctly?

With best regards,

Alexander Kalinkin

Hi Alexander,

Your suggestion doesn't fix the problem. You are right that I should be releasing the memory. In our production code we do, but I forgot this when creating the test program. Releasing the memory does improve performance, but by a similar amount in both versions so the performance drop is still there. I've attached the new results. I changed the call in the loop to do the following.

// Phase 13: analysis, numerical factorization, solve, and iterative refinement.
phase = 13;
IntelMathLibrary.Pardiso(
    pt,
    ref maxfct,
    ref mnum,
    ref mtype,
    ref phase,
    ref numberOfRows,
    nonZeroValues,
    rowIndices,
    columnIndices,
    perm,
    ref nrhs,
    pardisoIparam,
    ref msglvl,
    rhs,
    result,
    ref error);

// Phase -1: release all internal memory.
phase = -1;
IntelMathLibrary.Pardiso(
    pt,
    ref maxfct,
    ref mnum,
    ref mtype,
    ref phase,
    ref numberOfRows,
    nonZeroValues,
    rowIndices,
    columnIndices,
    perm,
    ref nrhs,
    pardisoIparam,
    ref msglvl,
    rhs,
    result,
    ref error);

Regards,

Euan

Is there any further news on this problem?

Regards,

Euan

Is this problem still being investigated? It has been quite some time with no replies.

Regards,

Euan

>>...Is this problem still being investigated? It has been quite some time with no replies...

Please upload the latest version of your test case. I understand that you made some modifications and that they didn't fix the problem. Thanks.

Hi Sergey

I have attached the modified code. The only change was to add the clearing of the memory into the loop following Alexander's suggestion as discussed above.

Thanks,

Euan

Attachments:

mklpardisotestv2.zip (70.19 KB)

Hi Euan, thank you, and sorry that this is taking so long. This thread is already four months old, and I think everybody needs a refreshed status of the problem.

I escalated this issue to our engineering team and our engineers have provided some comments:

We reproduced this issue on the releases named in the title. Over the last few years we have spent a lot of effort improving both the performance and the stability of the PARDISO code by modifying the reference code. Because PARDISO works with sparse matrices and its performance depends on the matrix pattern, performance can degrade on some matrices from release to release. The number of such matrices is small, and the number of cases with performance degradation must be smaller than the number of test cases with performance improvement. Moreover, the number of tests with a performance degradation of about 5 percent must be lower than 5 percent of the overall number of tests. An additional requirement is that there is no performance degradation on tests provided by customers.
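
For readers who want to apply similar criteria to their own benchmark set, the three requirements described in the comments above could be checked along the following lines. This is a sketch under assumptions, not Intel's actual process: `BenchmarkCase` and `RegressionGate` are hypothetical names, and "degradation about 5 percent" is interpreted here as a slowdown of at least 5%.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One benchmark case timed on the old and the new release (hypothetical type).
public sealed class BenchmarkCase
{
    public double OldSeconds;   // must be positive
    public double NewSeconds;
    public bool FromCustomer;

    // Relative change in run time; positive means the new release is slower.
    public double Slowdown => (NewSeconds - OldSeconds) / OldSeconds;
}

public static class RegressionGate
{
    // Returns true when all three criteria hold:
    //   1. fewer cases got slower than got faster;
    //   2. cases slower by at least `slowdownThreshold` make up less than
    //      `maxFraction` of all cases (assumed reading of the 5% rule);
    //   3. no customer-provided case got slower.
    // An empty list fails criterion 1 and therefore returns false.
    public static bool Passes(
        IReadOnlyList<BenchmarkCase> cases,
        double slowdownThreshold = 0.05,
        double maxFraction = 0.05)
    {
        if (cases == null) throw new ArgumentNullException(nameof(cases));

        int slower = cases.Count(c => c.Slowdown > 0);
        int faster = cases.Count(c => c.Slowdown < 0);
        int notablySlower = cases.Count(c => c.Slowdown >= slowdownThreshold);
        bool customerRegressed = cases.Any(c => c.FromCustomer && c.Slowdown > 0);

        return slower < faster
            && notablySlower < maxFraction * cases.Count
            && !customerRegressed;
    }
}
```

Under this reading, a single matrix that slows down by 10% can pass when it is one of many non-customer tests, but fails the check as soon as it is marked `FromCustomer`, which matches the situation in this thread.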

Can you please provide your benchmark? In spite of all the above requirements, we sometimes find matrices that show performance degradation between distant releases. This issue is one of those cases. We see the degradation but cannot improve the situation because of the extensive changes in the corresponding code. Can you please describe how you use PARDISO in your code, so that we can try to find a workaround for this issue?

Sridevi Allam Technical consulting engineer - Intel MKL
