OpenMP limit

Roman

Hi,

Does anyone know if there is a limit to the size of the reduction variable in OpenMP?

When I compile and run the attached simple program, it always crashes with a stack overflow.  I have tried setting the stack to 1GB (/STACK:1073741824).  I have tried using the heap (/heap-arrays:0).  And I have tried calling KMP_SET_STACKSIZE_S.  Nothing worked for me.

The program works as expected, if I make the size of x smaller.

Roman

Attachment: test-omp-stack.f90 (842 bytes)
Tim Prince

The thread stack size is controlled by an environment variable, e.g. KMP_STACKSIZE=8m, or by a library call.

The global (process) stack can be set with a linker option, e.g. /link /stack:800000000, or with editbin.

/heap-arrays with a number affects only stack allocations of size known at compile time.  I remember Steve Lionel recommending it without a number.

John Campbell

I think there could be some problem with " x = x + 1.0"

Try the attached changes with and without OMP and see what works. The K loop works in my case without OMP.
Stack problems should not be associated with ALLOCATE, but they could be associated with temporary arrays created for the array expressions.

John

Attachment: test-omp-stack.f90 (1.19 KB)
Sergey Kostrov

>>...Does anyone know if there is a limit to the size of the reduction variable in OpenMP?

You could try to set a lower value for OpenMP stack size at runtime. For example, to OMP_STACKSIZE=128K, or so.

Two more questions: are you using a 32-bit or a 64-bit platform, and how much memory is installed on your computer?

Sergey Kostrov

>>...The global stack can be set by option e.g. /link /stack:800000000...

800000000 bytes =~ 763 MB, and that looks like too much even for a 64-bit platform.

NotThatItMatters

Global stack size, a Windows limitation, is limited to 2 GB.  The best I have ever been able to do is use 268435456 bytes, which is 2 ^ 28 (256 MB).  It does not matter, Win32 or X64.  I have been struggling mightily with this in the app I work on: reducing stack usage is a real headache in older code without COMMON but with huge argument lists for routines.

jimdempseyatthecove

>> It does not matter, Win32 or X64

If your app is built as a 32-bit app, and
If you run on a system with 4 hardware threads, and
If you specify 1GB stack, and
If you launch OpenMP with default settings (4 threads),
Then each thread of the app will attempt to obtain 1GB (of the 2GB or 3GB available in VM)
(i.e. app requires 4GB of total stack)

>> reduction of stack size is a real headache in older code without COMMON but with huge argument lists for routines.

1) Use /heap-arrays for ifort
2) Ensure that these old routines are compiled with the -openmp option even though they do not contain OpenMP statements.

Jim Dempsey

www.quickthreadprogramming.com
Sergey Kostrov

>>...Global stack size, a Windows limitation, is limited to 2 Gb...

It is still not clear what platform Roman is using; I expect it is a 32-bit Windows platform.

Steve Lionel (Intel)

A couple of comments.  NotThatItMatters is correct that Windows, both 32-bit and 64-bit, limits the stack to less than 2GB. In fact, I usually cite 1GB as the upper limit.

Jim, I would normally defer to you on OpenMP issues, but I think you went a bit astray here. The stack limit one sets in the linker is for the process. Thread stacks come out of that and are sized by OMP_STACKSIZE (or KMP_STACKSIZE). One can also call KMP_SET_STACKSIZE_S before the first parallel region to set the thread stack size. It is not correct that if the linker stack size is 1GB, each thread asks for 1GB of stack.

Steve
Tim Prince

When I last checked it, Intel OpenMP set a default thread stack size of 2MB when in 32-bit mode, 4MB in 64-bit mode.  A 1MB thread stack, as Microsoft used to use, could lead to serious cache associativity problems on older CPUs. As Steve says, those can be increased by OMP/KMP methods, and this (multiplied by number of threads) is taken out of the process stack limit, which typically has to be increased from the default by options such as /link /stack:800000000 or by the editbin tool.

Roman

Hi,

Thanks for the replies.  In my initial post, I should have mentioned the following:

I am using 64-bit Windows 7, and I am definitely compiling 64-bit executables.  My computer has 20 GB of RAM, so memory should not be an issue.  When my program is compiled without OpenMP, it runs fine.  It only crashes with a stack overflow if /Qopenmp is used.  I have made the stack as large as possible, and tried calling KMP_SET_STACKSIZE_S() with large values, but it did not help.  It seems that for some reason, OpenMP needs additional stack space when doing reduction, and I have reached some kind of upper limit. Like I said in my initial post, if the size of x is smaller, everything works fine.

I would be happy to provide any additional information.  Has anyone tried compiling my initial program to see if they can reproduce my problem?

Roman

Steve Lionel (Intel)

I did and could reproduce it. I have asked some of my coworkers, who know more about OpenMP than I do, for their thoughts.

Steve
jimdempseyatthecove

integer :: n ! same as integer(4)
...
n = 140000000 ! 140000000 = 0x08583B00 (fits in 4 bytes)
...
!$ stack_size = n*20 !  140000000 * 20 = 2800000000 = 0xA6E49C00 (fits as an unsigned 32-bit value, but is a negative number as a signed 32-bit integer)

Use "integer(kind=KMP_SIZE_T_KIND) :: n" 
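
The overflow described above can be checked with a small stand-alone sketch. It is not taken from Roman's attachment: the program name is made up, and `int64` from `iso_fortran_env` is used as a portable stand-in for `KMP_SIZE_T_KIND`, on the assumption that the Intel kind is 8 bytes wide in a 64-bit build.

```fortran
program stack_size_overflow
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int32) :: n            ! default integer in the original test, i.e. integer(4)
  integer(int64) :: stack_size   ! assumption: stand-in for integer(kind=KMP_SIZE_T_KIND)

  n = 140000000_int32

  ! Promote n to 64 bits before multiplying. The product, 2800000000,
  ! is larger than huge(n) = 2147483647 and cannot be held in 32 bits.
  stack_size = int(n, int64) * 20_int64

  print '(a,i0)', 'largest 32-bit integer: ', huge(n)
  print '(a,i0)', 'requested stack bytes:  ', stack_size

  ! Self-check: the request really is out of range for a 32-bit integer.
  if (stack_size <= int(huge(n), int64)) error stop 'unexpected: value fits in 32 bits'
end program stack_size_overflow
```

The second line printed should be 2800000000. If `n*20` were evaluated in default 32-bit arithmetic instead, the value passed on as the stack size would typically come out negative.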

Jim Dempsey

www.quickthreadprogramming.com
Roman

This is a followup to the problem I was having. The OpenMP program I was working on was crashing with a stack overflow if the size of the reduction variable was very large. If anyone is interested, the attached file shows how I was able to solve this, by doing the reduction manually.  My solution isn't very pretty, but it does seem to work.

Roman

Attachment: test_omp_stack.f90 (3.3 KB)
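
The attachment itself is not reproduced in this thread, so the following is only a sketch of what "doing the reduction manually" could look like, not Roman's actual code. The names `x_local`, `n` and `niter`, and the sizes, are illustrative assumptions. Each thread allocates its own partial-sum array on the heap, accumulates into it, and then adds it into the shared array inside a critical section, so no large private copy of `x` has to be placed on a thread stack.

```fortran
program manual_reduction
  implicit none
  integer, parameter :: n = 1000000     ! illustrative size, much smaller than in the thread
  integer, parameter :: niter = 8       ! illustrative number of loop iterations
  real, allocatable :: x(:), x_local(:)
  integer :: i

  allocate(x(n))
  x = 0.0

  ! x_local is private and unallocated on entry to the parallel region.
  !$omp parallel private(x_local)
  allocate(x_local(n))                  ! per-thread partial sum lives on the heap
  x_local = 0.0

  !$omp do
  do i = 1, niter
     x_local = x_local + 1.0            ! same update as "x = x + 1.0", on the private copy
  end do
  !$omp end do

  ! Combine the partial sums one thread at a time.
  !$omp critical
  x = x + x_local
  !$omp end critical

  deallocate(x_local)
  !$omp end parallel

  ! Self-check: every element must have been incremented niter times.
  if (any(x /= real(niter))) error stop 'manual reduction gave a wrong result'
  print *, 'x(1) =', x(1)
end program manual_reduction
```

This should build with /Qopenmp (ifort on Windows) or -fopenmp (gfortran); compiled without OpenMP the directives are treated as comments and the program runs serially with the same result. The critical section serializes the final combine step, which is a simple choice for a sketch rather than the fastest one.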
