# How to use Parallel MKL from Linux?

Folks,

I'm very new to parallel programming, so I need your help. I'm trying to solve a very large symmetric sparse generalized eigenvalue problem using the Extended Eigensolver. I have no problem solving a smaller problem on one thread with the "dfeast_scsrgv" subroutine. However, I have no idea how to speed it up by using the parallel capability.

My system: Linux, Intel 64
Software: Intel Composer XE 2013, MPICH compiled with Composer XE 2013

Here is what I did:

1. Compile: mpif90 -mkl=parallel -o test_mpi.x test_sparse_solver.f90
2. Run: mpiexec -np 8 ./test_mpi.x

The run itself was OK, but my concern is whether I really used the parallel capability. For a smaller problem with 2000 equations, "-np 8" took longer than "-np 1". I realize I might need to change the source code, but I have no idea where to start. Could you give me a quick reference for getting it to run in parallel? I'd very much appreciate it, and thanks in advance.

Letian

Here is my source code (Fortran 90):

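! This program tests the MKL sparse eigensolver (FEAST)
implicit real*8 (a-h,o-z)
real*8,allocatable::a(:),b(:)
integer,allocatable::cola(:),rowa(:),colb(:),rowb(:)
real*8,allocatable::e(:), x(:,:), res(:)
integer fpm(128)
real time_begin, time_end

m0=50        ! size of the search subspace (upper bound on eigenvalues expected in the interval)
emin=0.0d0   ! lower bound of the search interval
emax=2.0d7   ! upper bound of the search interval
fpm=0

! n, na, nb and the matrix data have to be read from unit 98;
! the read statements are not included in this listing
open(98,file='ifort98.dat',form='unformatted')
allocate (a(na),cola(na),rowa(n+1))
allocate (b(nb),colb(nb),rowb(n+1))
close(98)

call CPU_time(time_begin)

! res must be an array of length m0 (one residual per eigenpair), not a scalar
allocate (e(m0), x(n,m0), res(m0))

call feastinit(fpm)   ! fill fpm with the default FEAST parameters
print*,fpm
call dfeast_scsrgv('U',n,a,rowa,cola,b,rowb,colb,fpm,epsout,loop,emin,emax,m0,e,x,m,res,info)

print*,'info=',info
print*,'m=',m
print*,'loop=',loop
print*,'epsout=',epsout

! write the frequency sqrt(lambda)/(2*pi) for each of the m eigenvalues found
open(10,file='test.out')
do i=1,m
write(10,*) 'mode',i,' Freq=', sqrt(e(i))*0.5d0/3.1415926535897932d0
enddo
close(10)

deallocate (a,b,cola,rowa,colb,rowb,e,x,res)

call cpu_time(time_end)

print*,'Total CPU time=', time_end-time_begin
stop
end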

Hi Letian,

This is a big question. I suggest you start with MKL's internal parallelism (threading).

In most cases, MKL already works out the best parallel performance on multi-core machines based on your system configuration and problem size. If you call the threaded MKL library, your application runs the MKL routines in parallel automatically.
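As a rough illustration of the threaded-library point, the sketch below asks MKL how many threads it intends to use and then overrides that number from inside the program. It assumes the program is linked against the threaded MKL library (for example with -mkl=parallel); the program name and the value 4 are arbitrary choices for illustration and do not come from the original code.

```fortran
program mkl_threads_sketch
  implicit none
  ! MKL service function: upper limit on the threads MKL will use
  integer, external :: mkl_get_max_threads
  integer :: nthreads

  ! Default limit; this reflects MKL_NUM_THREADS / OMP_NUM_THREADS if they are set
  nthreads = mkl_get_max_threads()
  print *, 'MKL max threads (default)   =', nthreads

  ! Override from inside the program; 4 is an arbitrary illustrative value
  call mkl_set_num_threads(4)
  print *, 'MKL max threads (after set) =', mkl_get_max_threads()
end program mkl_threads_sketch
```

If the first number printed is 1 on a multi-core machine, that may indicate that the sequential MKL library was linked or that MKL_NUM_THREADS is set to 1.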

For example, you could try PARDISO first and see how the performance changes with export MKL_NUM_THREADS=1, 2, 4, or 8, building and running with the commands

> ifort -mkl your.f90

> ./a.out
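To see the effect of changing MKL_NUM_THREADS you need wall-clock time rather than CPU time: in a threaded run, cpu_time typically reports CPU time summed over all threads, so it may not go down even when the run finishes sooner. Here is a minimal sketch using the standard system_clock intrinsic. The loop is only a placeholder workload standing in for the solver call, and all names are illustrative.

```fortran
program wall_clock_sketch
  implicit none
  integer(kind=8) :: count_start, count_end, count_rate
  real :: cpu_start, cpu_end
  real(kind=8) :: wall_seconds, s
  integer :: i

  ! Start both timers: system_clock gives elapsed (wall) time, cpu_time gives CPU time
  call system_clock(count_start, count_rate)
  call cpu_time(cpu_start)

  ! Placeholder workload; in a real program this would be the call into MKL
  s = 0.0d0
  do i = 1, 50000000
     s = s + sqrt(real(i, kind=8))
  end do

  call cpu_time(cpu_end)
  call system_clock(count_end)

  ! Convert clock ticks to seconds
  wall_seconds = real(count_end - count_start, kind=8) / real(count_rate, kind=8)

  print *, 'checksum      =', s
  print *, 'CPU time (s)  =', cpu_end - cpu_start
  print *, 'wall time (s) =', wall_seconds
end program wall_clock_sketch
```

In this single-threaded placeholder the two times should be close; with a threaded MKL call in place of the loop, the wall time is the one to compare across thread counts.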

(I'm not sure how MPI processes interact with MKL threading, which is based on OpenMP.)

Then, if you really need to parallelize your application yourself, you may need to learn the various parallel programming methods, typically OpenMP, as in

http://software.intel.com/en-us/forums/topic/487697
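For a feel of what hand-written OpenMP looks like in Fortran, here is a minimal sketch of a parallel loop with a reduction. It is a generic illustration, not taken from the linked topic, and the summed series is an arbitrary example. It assumes the compiler's OpenMP option is enabled (-openmp with Composer XE 2013, or -fopenmp with gfortran).

```fortran
program openmp_sketch
  use omp_lib
  implicit none
  integer, parameter :: n = 1000000
  integer :: i
  real(kind=8) :: total

  total = 0.0d0

  ! Iterations are split across threads; each thread keeps a private
  ! partial sum, and the partial sums are combined at the end of the loop
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + 1.0d0 / real(i, kind=8)
  end do
  !$omp end parallel do

  print *, 'threads available =', omp_get_max_threads()
  print *, 'harmonic sum      =', total
end program openmp_sketch
```

The thread count here is controlled by OMP_NUM_THREADS, in the same way that MKL_NUM_THREADS controls the threads used inside MKL.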