Stack Overflow during running parallel FORTRAN code

Hi

I have a Fortran code that runs sequentially without problems (I compile it with /O3 for the x64 platform). I then added OpenMP directives to speed the code up. Now it gives me a "Stack overflow" message (even if I run it ). I increased the stack reserve size to about 1GB, but that did not help.

Here is the part of the code that I changed to make it parallel:

    call OMP_SET_NUM_THREADS(6);
    !$OMP PARALLEL DO DEFAULT(PRIVATE) SHARED(g_num,g_coord,nn,nels,anatyp)   &   
    !$OMP SHARED(coord_elm_center,loc_ele_cor,wix4,der4,fun4,Shear_Skeleton)  &
    !$OMP SHARED(EleMode,v,vu,Biot_Coef,c,dtim,permx,Kr,cT,gam,omg2,gcor8)    &
    !$OMP SHARED(wix8,eqn,counter1,counter2,counter3,counter4,counter5,Th_Exp)&
    !$OMP SHARED(lan,lan1,der8,fun8,gcor20,wix20,fun20,der20,gcor40,wix40)    &
    !$OMP SHARED(fun40,der40,gcor61,wix61,fun61,der61) SCHEDULE(DYNAMIC)      &
    !$OMP REDUCTION (+:Lhs,LhsSig,LhsU,A15,A25,A35,A45,A55_Heat)
    Main: do iel=1, nels
    ! ... do some calculation ...

    enddo Main;
    !$OMP END PARALLEL DO 

I appreciate any help.

 

You need to set the environment variable KMP_STACKSIZE to a larger value (but not 1GB!) - this is the per-thread stack size. I also suggest a somewhat lower stack reserve size - try 100000000 to begin with.

Retired 12/31/2016

Hi Lionel

Can you show me how to do that?

 


Hi,

I had this issue when I had a large private array. On entering the parallel region, a local copy was created on each thread's stack, causing the overflow.

My solution was to allocate the array dynamically before the parallel section, with an additional dimension sized to the number of threads. The array can then be shared between all threads (i.e. no copy is created on the stack), and each thread accesses its own slice using the index OMP_GET_THREAD_NUM() + 1 (the '+1' is needed because this function is zero based, not one based).
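The approach described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the original post: the array name `work`, the sizes `n` and `nels`, and the loop body are hypothetical placeholders. It assumes a compiler with OpenMP enabled (for example /Qopenmp with Intel Fortran on Windows).

```fortran
program per_thread_workspace
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 200000        ! hypothetical workspace length
  integer, parameter :: nels = 1000       ! hypothetical number of elements
  real(dp), allocatable :: work(:,:)      ! shared array, one column per thread
  real(dp) :: total
  integer :: iel, tid, nthreads

  ! Allocate once, before the parallel region, so the data is not
  ! copied onto each thread's stack.
  nthreads = omp_get_max_threads()
  allocate(work(n, nthreads))

  total = 0.0_dp
  !$omp parallel do default(none) shared(work) private(iel, tid) &
  !$omp reduction(+:total) schedule(dynamic)
  do iel = 1, nels
     tid = omp_get_thread_num() + 1       ! thread numbers start at 0
     work(:, tid) = real(iel, dp)         ! each thread touches only its own column
     total = total + work(1, tid)
  end do
  !$omp end parallel do

  ! Self-check: the sum of 1..nels must equal nels*(nels+1)/2.
  if (abs(total - 0.5_dp * nels * (nels + 1)) > 1.0e-6_dp) error stop 'unexpected sum'
  print *, 'total =', total

  deallocate(work)
end program per_thread_workspace
```

Putting the thread index in the last dimension keeps each thread's slice contiguous in memory, since Fortran stores arrays in column-major order.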

 

 

I had exactly the same problem recently, and solved it in a similar way to Michael Roberts.

Chris G

Thank you Michael, Chris and Steve. I have increased KMP_STACKSIZE to 999M, but it did not solve the problem. I think the best approach is the one Michael describes. I will try it and report back. Thanks.

KMP_STACKSIZE (or, using the standard name, OMP_STACKSIZE) defaults to 4MB on Intel 64-bit targets. A typical setting, when the default isn't sufficient, is 9MB. When Steve said don't use 1GB, I doubt he meant 999MB. I haven't heard of any application where more than 40MB is required. You ought to be able to estimate how much space is required for your private arrays by multiplying the data size by the number of threads.
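As a rough illustration of that estimate, the sketch below computes the footprint of one hypothetical private array. The dimension `n` is an assumption chosen for the example, not a value from the original code; the thread count of 6 is the one set in the original post. Each thread's stack has to hold the per-thread figure, and the total is that figure times the number of threads.

```fortran
program estimate_private_stack
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3000           ! hypothetical dimension of a private n x n array
  integer, parameter :: nthreads = 6       ! thread count used in the original post
  integer(int64) :: elem_bytes, bytes_per_thread, bytes_total

  ! storage_size returns the size of one element in bits
  ! (typically 64 bits, i.e. 8 bytes, for double precision).
  elem_bytes = int(storage_size(1.0_dp), int64) / 8_int64

  bytes_per_thread = elem_bytes * n * n            ! one private copy
  bytes_total      = bytes_per_thread * nthreads   ! copies across all threads

  print '(a,f10.1,a)', 'per thread: ', real(bytes_per_thread, dp) / 2.0_dp**20, ' MiB'
  print '(a,f10.1,a)', 'all threads:', real(bytes_total, dp) / 2.0_dp**20, ' MiB'
end program estimate_private_stack
```

If the per-thread figure comes out far above the few tens of megabytes mentioned above, that suggests moving the array off the stack rather than raising the stack size further.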

When you set KMP_STACKSIZE=999MB, you risk adding 1GB times the number of threads to the allowance you would require in /link /stack, which would put a low limit on the number of threads. I don't know specifically for your platform, but I wouldn't count on being able to increase the effective stack reserve to as much as 16GB (note that Steve suggested a more modest value).
 
