FEAST sparse with parallelism error

Intel MKL 11.1 update 3

In some cases, when using FEAST with a sparse matrix via dfeast_scsrgv(), the program finds no eigenvalues at all. I have run the same problem, with the same data and parameters, on many computers (32-bit, 64-bit, Intel CPUs, AMD CPUs...), and it only works when:

  1. The CPU has only one core
  2. The CPU has many cores but I set MKL_NUM_THREADS=1

For example, this is a piece of output when it works:

Intel MKL Extended Eigensolvers: double precision driver
Intel MKL Extended Eigensolvers: List of input parameters fpm(1:64)-- if different from default
Intel MKL Extended Eigensolvers: fpm(1)=1
Intel MKL Extended Eigensolvers: fpm(3)=11
Search interval [4.025678249387654e-004;6.032205425995101e-002]
Intel MKL Extended Eigensolvers: Size subspace 100
#Loop | #Eig  |    Trace     | Error-Trace |  Max-Residual
Intel MKL Extended Eigensolvers: Resize subspace 10
0,3,8.152634965556513e-002,1.000000000000000e+000,8.551406370652709e-001
1,3,8.152638558335801e-002,5.955996247212547e-007,4.938321500736022e-010
2,3,8.152638558335654e-002,2.438652870290242e-014,4.629527481738191e-010
3,3,8.152638558338894e-002,5.371938162384636e-013,1.005998303916379e-009
4,3,8.152638558343898e-002,8.296020990817558e-013,5.923173827457810e-010
5,3,8.152638558343150e-002,1.240031978383434e-013,4.955312287793431e-010
6,3,8.152638558343395e-002,4.049084011047948e-014,4.994981329528280e-010
7,3,8.152638558346401e-002,4.983134072687419e-013,1.094808792578344e-009
8,3,8.152638558338488e-002,1.311811194942921e-012,6.506140369093765e-010
9,3,8.152638558339712e-002,2.029143237354711e-013,6.532719755830238e-010
10,3,8.152638558342154e-002,4.049084011047949e-013,6.415808120734978e-010
11,3,8.152638558347081e-002,8.167186499556941e-013,4.743023834809335e-010
12,3,8.152638558343139e-002,6.533749199645554e-013,6.517576568383395e-010
13,3,8.152638558346884e-002,6.207061739663276e-013,3.966558969833090e-010
14,3,8.152638558345569e-002,2.178683271853641e-013,3.422892146485306e-010
15,3,8.152638558344005e-002,2.592794136619908e-013,6.560782807951622e-010
16,3,8.152638558346825e-002,4.674851540028087e-013,6.617541587530926e-010
17,3,8.152638558343357e-002,5.749239172505014e-013,9.496449710626763e-010
18,3,8.152638558347872e-002,7.483903572692600e-013,8.118187669387088e-010
19,3,8.152638558347403e-002,7.776081793944356e-014,8.062934740915683e-010
20,3,8.152638558344921e-002,4.113501256678257e-013,4.731713012698910e-010
Intel MKL Extended Eigensolvers have successfully converged (to desired tolerance).

 

This is the same piece of output when it doesn't work:

Extended Eigensolvers: double precision driver
Extended Eigensolvers: List of input parameters fpm(1:64)-- if different from default
Extended Eigensolvers: fpm(1)=1
Extended Eigensolvers: fpm(3)=11
Search interval [4.025678249387654e-004;6.032205425995101e-002]
Extended Eigensolvers: Size subspace 100
#Loop | #Eig  |    Trace     | Error-Trace |  Max-Residual
Extended Eigensolvers: Resize subspace 10
Extended Eigensolvers WARNING: No eigenvalue has been found in the proposed search interval
==>INFO code =: 1
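To make runs like the two above easier to compare, the returned `info` value can be turned into a readable status. This is only a sketch: `describe_feast_info`, `FeastStatus` and `FeastInfo` are hypothetical names, not part of MKL. The meaning of code 1 is taken from the warning printed above; the other messages are paraphrased from the Extended Eigensolver documentation as I remember it and should be checked against your MKL version.

```cpp
#include <string>

// Coarse classification of the `info` value returned by a FEAST driver
// such as dfeast_scsrgv(). Hypothetical helper, not an MKL API.
enum class FeastStatus { Success, Warning, Error };

struct FeastInfo {
    FeastStatus status;
    std::string message;
};

FeastInfo describe_feast_info(int info) {
    if (info == 0)
        return {FeastStatus::Success, "converged"};
    if (info == 1)  // matches the warning shown in the failing log
        return {FeastStatus::Warning, "no eigenvalue found in the search interval"};
    if (info == 2)  // assumption: paraphrased from the MKL documentation
        return {FeastStatus::Warning, "no convergence within the allowed refinement loops"};
    if (info == 3)  // assumption: paraphrased from the MKL documentation
        return {FeastStatus::Warning, "initial subspace size m0 is too small"};
    if (info > 0 && info < 100)
        return {FeastStatus::Warning, "other warning, see the MKL documentation"};
    // Negative codes and codes >= 100 are treated as errors here
    // (internal failures or invalid input parameters).
    return {FeastStatus::Error, "error, see the MKL documentation"};
}
```

Logging `describe_feast_info(info).message` after every call makes it obvious which runs return 0 and which return 1 when only the thread count changes.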

 

Any suggestions?

Luis Gonzalez Torquemada
Arquitecto

Can you give us a C++ or Fortran example that we can compile to check the problem?

Hello, Gennady:

In the attached zip file you can find:

  • EigenSparseSolver.cpp and EigenSparseSolver.h, to build TestProgram.exe
  • Gccomp.TMP, Lcomp.TMP and Masa.TMP, the input data for TestProgram.exe
  • RunProgram.bat, which runs the program with the appropriate command-line parameters. You must replace the "Full Path of Directory Data" parameter with the directory where the data files are placed.

Thanks!

 

Attachments:

TestEigenSparseSolver.zip (90.96 KB)
Luis Gonzalez Torquemada
Arquitecto

Thanks, we will check the problem.

It seems the problem is caused by ill-conditioning of the input matrix. To mitigate it, please try CNR (Conditional Numerical Reproducibility) mode; it should help.

For example, on my side, when I set MKL_NUM_THREADS=4 and MKL_CBWR=AVX (my CPU supports this instruction set), the problem is resolved. Here is the output:

test.exe FEAST 7140 C:\_tmp\_Forums\u515045\  30 0
Arg[0 = Path    ]: <_1.exe>
Arg[1 = Fun.    ]: <FEAST>
Arg[2 = ngdl    ]: <7140>
Arg[3 = Est.    ]: <C:\_tmp\_Forums\u515045\>
Arg[4 = N modos ]: <30>
Arg[5 = hMesg   ]: <0>
Estructura grande: 786 masas no nulas
Intel MKL Extended Eigensolvers: double precision driver
Intel MKL Extended Eigensolvers: List of input parameters fpm(1:64)-- if differe
Intel MKL Extended Eigensolvers: fpm(1)=1
Intel MKL Extended Eigensolvers: fpm(3)=11
Search interval [4.025678249387654e-004;6.032205425995101e-002]
Intel MKL Extended Eigensolvers: Size subspace 100
#Loop | #Eig  |    Trace     | Error-Trace |  Max-Residual
Intel MKL Extended Eigensolvers: Resize subspace 9
0,3,8.152662643466646e-002,1.000000000000000e+000,1.401705246616198e+000
1,3,8.152638558345735e-002,3.992755420392927e-006,8.464923649033275e-010
2,3,8.152638558341449e-002,7.104301946656856e-013,7.273199218427829e-010
3,3,8.152638558337148e-002,7.129608721725905e-013,6.578973692540981e-010
4,3,8.152638558346656e-002,1.576151963618721e-012,7.294627981078312e-010
5,3,8.152638558349780e-002,5.178686425493712e-013,5.309863801850973e-010
6,3,8.152638558342526e-002,1.202531938962933e-012,5.290235569311135e-010
7,3,8.152638558341960e-002,9.386512934702063e-014,7.919877550439916e-010
8,3,8.152638558347067e-002,8.466266568554801e-013,7.991677130398161e-010
9,3,8.152638558344874e-002,3.634973146281681e-013,6.607200006840905e-010
10,3,8.152638558341199e-002,6.092030943894868e-013,4.556686457271172e-010
11,3,8.152638558344574e-002,5.595097906175348e-013,8.606599954732313e-010
12,3,8.152638558346162e-002,2.631904607181167e-013,5.577805973475511e-010
13,3,8.152638558348506e-002,3.885740281056810e-013,5.683167330868062e-010
14,3,8.152638558342930e-002,9.243874747949237e-013,5.940228760491368e-010
......................
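The two settings used for this run can also be applied from a small launcher instead of typing them in the console. What follows is a minimal sketch, not the way the test was actually run here: `set_env` and `run_with_mkl_settings` are hypothetical helper names. It only sets the variables in the launcher's environment and starts the real program as a child process, which inherits them, so it does not depend on when MKL reads its environment.

```cpp
#include <cstdlib>
#include <string>

// Set an environment variable in the current process.
// Child processes started afterwards inherit it.
bool set_env(const std::string& name, const std::string& value) {
#ifdef _WIN32
    return _putenv_s(name.c_str(), value.c_str()) == 0;
#else
    return setenv(name.c_str(), value.c_str(), 1) == 0;  // 1 = overwrite
#endif
}

// Run `command` with MKL_NUM_THREADS and MKL_CBWR set.
// Returns -1 if a variable could not be set, otherwise whatever
// std::system() returns for the command.
int run_with_mkl_settings(const std::string& command,
                          int num_threads,
                          const std::string& cbwr) {
    if (!set_env("MKL_NUM_THREADS", std::to_string(num_threads))) return -1;
    if (!set_env("MKL_CBWR", cbwr)) return -1;
    return std::system(command.c_str());
}
```

Calling `run_with_mkl_settings(cmd, 4, "AVX")` with the usual test command line reproduces the configuration above. MKL also provides `mkl_set_num_threads()` and `mkl_cbwr_set()` for doing the same inside the program; as far as I know, `mkl_cbwr_set()` has to be called before any other MKL function.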

/Gennady
