I wrote a simple program that repeatedly estimates pi by Monte Carlo simulation, so that I could compare the performance of the serial and the parallelized versions of the code. To my great surprise, the parallel code was slower! Since I am a beginner, I suspect I am not grasping some key aspect of parallel programming. The whole code is reported below. I am working with a version of an Intel 6700 processor with 4 cores.
I don't know whether this forum is the right place for this kind of question, but thanks in advance for any help you can give me.
! This program computes the value of greek pi "n" times using simulation
! Each time the computation is performed using "m" draws
! The computation is carried out by the subroutine "montec"
! In the end the average of the n simulations is computed and printed on screen
double precision greekpi(n),outp,avpi,den
double precision start_time,end_time
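! outp is written by every thread inside the loop, so it must be private (a shared outp is a data race)
!$omp parallel private(i,outp)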
nthreads = omp_get_num_threads()
print*, 'number of threads',nthreads
!$omp do schedule(dynamic,chunk)
do i = 1,n
greekpi(i) = outp
outp = 0.0d0
!$omp end do
!$omp end parallel
print*, 'average value of greek pi'
den = n
avpi = sum(greekpi)/den
print*, 'running time'
print*, end_time - start_time
double precision sol
double precision xr1,xr2,yv(ndr),sumsq,totins,tot
totins = 0.0d0
do i = 1,ndr
sumsq = xr1**2 + xr2**2
if (sumsq.le.1.0d0) then
totins = totins + 1.0d0
tot = ndr
sol = totins/tot
sol = 4.0d0*sol
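
For comparison, here is a minimal, self-contained sketch of the same experiment in which the threads share no mutable state. It is not the code above: the names `pi_sketch`, `estimate_pi` and `next_uniform`, the values of `n` and `m`, and the seeding formula are all hypothetical, and the intrinsic random number generator is swapped for a small Park-Miller generator whose state is passed explicitly. That generator is used only to keep the example free of shared state; it is not statistically rigorous, and the differently seeded streams are not guaranteed to be independent. Whether shared state is what makes the parallel version slow is an assumption that would need to be checked by timing.

```fortran
program pi_sketch
  use omp_lib
  use iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 8          ! hypothetical number of repetitions
  integer, parameter :: m = 1000000    ! hypothetical number of draws per repetition
  real(real64) :: greekpi(n), avpi, start_time, end_time
  integer :: i

  start_time = omp_get_wtime()
  ! The loop index i is private automatically; each iteration writes only greekpi(i).
  !$omp parallel do schedule(static)
  do i = 1, n
     ! Each repetition gets its own (hypothetical) seed, so no RNG state is shared.
     greekpi(i) = estimate_pi(m, 1000_int64 + 7919_int64*i)
  end do
  !$omp end parallel do
  end_time = omp_get_wtime()

  avpi = sum(greekpi)/real(n, real64)
  print *, 'max number of threads', omp_get_max_threads()
  print *, 'average value of greek pi', avpi
  print *, 'running time', end_time - start_time

contains

  ! Estimates pi as 4 times the fraction of ndr points drawn in the unit
  ! square that fall inside the quarter circle of radius 1.
  function estimate_pi(ndr, seed) result(sol)
    integer, intent(in) :: ndr
    integer(int64), intent(in) :: seed
    real(real64) :: sol, xr1, xr2
    integer(int64) :: state
    integer :: k, inside

    state = seed
    inside = 0
    do k = 1, ndr
       xr1 = next_uniform(state)
       xr2 = next_uniform(state)
       if (xr1*xr1 + xr2*xr2 <= 1.0_real64) inside = inside + 1
    end do
    sol = 4.0_real64*real(inside, real64)/real(ndr, real64)
  end function estimate_pi

  ! Park-Miller "minimal standard" generator; state must lie in 1..2147483646.
  ! The product stays below 2**46, so the int64 arithmetic cannot overflow.
  function next_uniform(state) result(u)
    integer(int64), intent(inout) :: state
    real(real64) :: u
    state = mod(16807_int64*state, 2147483647_int64)
    u = real(state, real64)/2147483647.0_real64
  end function next_uniform

end program pi_sketch
```

With gfortran this can be built with `gfortran -fopenmp pi_sketch.f90`. Running the executable with `OMP_NUM_THREADS=1` and then `OMP_NUM_THREADS=4` gives a serial versus parallel timing of exactly the same code, and because the seed depends only on `i`, both runs should print the same average.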