Folks,

I'm very new to parallel programming and need your help. I'm trying to solve a very large symmetric sparse generalized eigenvalue problem using the Intel MKL Extended Eigensolver. I have no problem solving a smaller-scale problem on one thread with the "dfeast_scsrgv" subroutine. However, I have no idea how to speed it up by using the solver's parallel capability.

My system: Linux, Intel 64

Software: Intel Composer XE 2013, MPICH compiled with Composer XE 2013

Here is what I did:

1. Compile: mpif90 -mkl=parallel -o test_mpi.x test_sparse_solver.f90

2. Run: mpiexec -np 8 ./test_mpi.x

The run itself was OK, but my concern is whether I am really using the parallel capability. For a smaller problem with 2000 equations, "-np 8" took longer than "-np 1". I realize I might need to change the source code, but I have no idea where to start. Could you point me to a quick reference for getting it to run in parallel? Much appreciated, and thanks in advance.
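
One check I can think of is sketched below. It assumes (I have not confirmed this) that the MKL solver is parallelized with OpenMP threads inside a single process rather than across MPI ranks, in which case "-np 8" would simply launch eight copies of the same job. MKL_NUM_THREADS and OMP_NUM_THREADS are the usual MKL/OpenMP thread-count environment variables; the helper name time_with_threads is made up for illustration. It runs a command with a given thread count and prints the elapsed wall-clock time in whole seconds.

```shell
#!/bin/sh
# Hypothetical helper: run a command with N MKL/OpenMP threads and
# print the elapsed wall-clock time in whole seconds.
time_with_threads() {
    nthreads=$1; shift
    start=$(date +%s)
    # env passes the thread-count variables only to this one command.
    env MKL_NUM_THREADS="$nthreads" OMP_NUM_THREADS="$nthreads" "$@"
    end=$(date +%s)
    echo "threads=$nthreads elapsed_s=$((end - start))"
}

# Run the solver as a single process with increasing thread counts.
# Skipped when the executable is not in the current directory.
if [ -x ./test_mpi.x ]; then
    for t in 1 2 4 8; do
        time_with_threads "$t" ./test_mpi.x
    done
fi
```

If the elapsed time drops as the thread count grows, the threads are being used. For a problem as small as 2000 equations the difference may be too small to show at one-second resolution.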

Letian

Here is my source code (Fortran 90):

! This program tests the MKL sparse eigensolver (FEAST).
implicit real*8 (a-h,o-z)
real*8,allocatable::a(:),b(:)
integer,allocatable::cola(:),rowa(:),colb(:),rowb(:)
! res holds one residual per eigenpair; it must be an array of size m0
real*8,allocatable::e(:), x(:,:), res(:)
integer fpm(128)
real time_begin, time_end

m0=50        ! initial guess for the subspace dimension
emin=0.0d0   ! lower bound of the search interval
emax=2.0d7   ! upper bound of the search interval
fpm=0

! read A and B in CSR format (values, column indices, row pointers)
open(98,file='ifort98.dat',form='unformatted')
read(98) n, na
allocate (a(na),cola(na),rowa(n+1))
read(98) (a(i),i=1,na)
read(98) (cola(i),i=1,na)
read(98) (rowa(i),i=1,n+1)
read(98) n, nb
allocate (b(nb),colb(nb),rowb(n+1))
read(98) (b(i),i=1,nb)
read(98) (colb(i),i=1,nb)
read(98) (rowb(i),i=1,n+1)
close(98)

! note: cpu_time measures processor time, not wall-clock time
call cpu_time(time_begin)
allocate (e(m0), x(n,m0), res(m0))
call feastinit(fpm)   ! set the default FEAST parameters
print*,fpm

! 'U': only the upper triangular parts of A and B are stored
call dfeast_scsrgv('U',n,a,rowa,cola,b,rowb,colb,fpm,epsout,loop,emin,emax,m0,e,x,m,res,info)
print*,'info=',info
print*,'m=',m
print*,'loop=',loop
print*,'epsout=',epsout

! write the frequency sqrt(lambda)/(2*pi) of each mode found
open(10,file='test.out')
do i=1,m
write(10,*) 'mode',i,' Freq=', sqrt(e(i))*0.5d0/3.1415926535897932d0
enddo
close(10)

deallocate (a,b,cola,rowa,colb,rowb,e,x,res)
call cpu_time(time_end)
print*,'Total CPU time=', time_end-time_begin
stop
end