segfaults with ifort 8., array sizes, subroutines, -static, -static-libcxa ?

Hi,

I have been using the free version of ifc since v5.0. At work, we
switched recently to ifort v8.0 and some of our codes do not work
anymore due to segfaults. The problem happens on RH9.0 and Debian 3.0 (woody) (I did not test it with RH8 or older). Our computers
all have at least 1 GB of RAM so I don't think the problem comes from
real memory limits.

Finally I managed to isolate a simple buggy case but could not
resolve it.

Consider this code :

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

program toto
implicit none
integer NC,NX,NY,NZ

NC=5
NX=62
NY=16
NZ=16

call plouf(NC,NX,NY,NZ)

end

subroutine plouf(NC,NX,NY,NZ)

implicit none
integer, intent(in):: NC,NX,NY,NZ
double precision,dimension(NC,NX,NY,NZ) :: glou
print*,NC,NX,NY,NZ
print*,"yes"

end subroutine

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

[I need to compile with a static-libcxa]

On RH9.0 :

1/ Compiled with -static-libcxa or without any command line option

Everything goes fine in this case (compilation, run).
If I change NY and NZ to 64 instead of 16 everything works fine.
However if I increase NY and NZ again, I have a segfault (even though
this array fits into real memory).

2/ Compiled with -static

A segfault occurs during the call to the subroutine if NY and NZ equal 64,
but not if NY and NZ equal 16.

On Debian woody :

Segfault with every set of compilation options when NY=NZ=64.
No segfault for NY=NZ=16.

I tried several things: unlimit memoryuse, giving higher limits
for stacksize (as there has been a recent discussion on ifort segfaulting
for this reason), checking array bounds with -CB. Nothing helped...

This code works perfectly with Compaq, SGI, and IBM Fortran.

Could this come from ifort 8.0 or libc or am I missing something ?
What should I do ?

Thanks a lot
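
A rough size check may help when reading the numbers in the post above. The sketch below only does the arithmetic for the array glou in plouf, assuming 8 bytes per double precision element. The program name glou_size and the helper report are illustrative and not part of the original code.

```fortran
program glou_size
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit integers for byte counts
  integer, parameter :: bytes_per_dp = 8            ! assumption: double precision is 8 bytes

  call report(5, 62, 16, 16)   ! the case that runs
  call report(5, 62, 64, 64)   ! the case that segfaults with -static

contains

  ! Print the storage needed by a (nc,nx,ny,nz) double precision array.
  subroutine report(nc, nx, ny, nz)
    integer, intent(in) :: nc, nx, ny, nz
    integer(i8) :: nbytes
    nbytes = int(nc, i8) * nx * ny * nz * bytes_per_dp
    print '(a,i3,a,i3,a,i10,a,f7.2,a)', 'NY=', ny, ' NZ=', nz, ': ', nbytes, &
          ' bytes (', real(nbytes) / 1048576.0, ' MiB)'
  end subroutine report

end program glou_size
```

Under that assumption the array takes 634,880 bytes for NY=NZ=16 and 10,158,080 bytes (about 9.7 MiB) for NY=NZ=64. Local arrays sized from dummy arguments are commonly placed on the stack, so the larger case could run into a stack limit of a few MiB even though it is far below physical memory.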


I have exactly the same problem on Slackware 9.1, and tried the same solutions to no avail.

As far as I understand, it is a stack issue, since lowering the system dimensions helps. The program in question is a legacy code which is indeed memory-hungry, but we have 2 GB of RAM and the stack size is set to unlimited. So it is likely to be a bug in ifc 8.0. For now I have to fall back to 7.1, which is a pity.

If you find a solution to this problem, please post it back on this forum. I promise to do the same if I find one.

Best of luck

Evgeniy

P.S. Exactly the same code compiled with 7.1 with exactly the same flags runs fine. top shows that at the point where the program compiled with 8.0 segfaults, it uses approximately 180 MB of memory. All ulimits are manually set to 1 GB. The segfault occurs on entry to a subroutine which has no local arrays, just the ones passed as arguments. These are passed from the caller, where they are one 1D array, into the subroutine, where they are interpreted as 3D arrays.
So stack might not be the issue unless there is a memory leak in the compiler code. Alignment was another suspect; I tried "-align all" but that does not help.

Any ideas?

ifort 8.0 is reputed to consume more stack than 7.1 in most situations. I would guess that you are passing arguments as (f77 terminology) variable dimensioned arrays. The compiler would be generating code to compute a bunch of addresses and strides, and push them on the stack, for later use inside loops. If you were close to the stack limit before, you may have to increase it.

As I said, maximum memory usage at the point of the segfault is 180 MB and the stack size limit is 1 GB, so that would be a very big jump in stack usage :)

The question that I posted just now, "optimization problem", is, I feel, closely related to your problem. The "known problems and issues" page for the Intel Fortran Compiler for Linux mentions that V 8.0 will consume more stack space and can die with a signal 11.

http://www.intel.com/support/performancetools/fortran/stack.htm

If you look at my problem, the segmentation fault occurs only above a particular problem size, when calling a subroutine, and only at the O3 optimization level. O3 causes a lot of aggressive optimization and I am guessing, only guessing, that when calling a library routine a LOT of stack space is being used, so the program runs out of stack and dies. However, I am still confused as to why this doesn't happen if I use -g. I have tried setting the soft limits manually to large values like 1 GB and 1.5 GB. "unlimited" is not a very good choice as it sets hard limits. This is also mentioned in the release notes for the V 8.0 compilers. Can you try to compile your code with -g added and see if you still get segmentation faults?

thanks.

regards,
Kalyan

I tried it with -g and without optimization when I was running it under gdb. It does not help; unfortunately it segfaults at the same place.

Setting the limits (stack, data, max mem, max mem locked) manually to large values like 1 GB and increasing the per-thread stack size (KMP_STACKSIZE) finally solved my problem. Hope this helps.

Maybe you should raise a Premier issue.

thank you,
kalyan

I have also experienced an odd problem with larger memory.

It occurred only when using array notation as opposed to a DO loop.

According to Intel Premier support, my problem can be resolved
if it naturally links to the libc in /lib/i686/*
instead of /lib

Only some of my installations have /lib/i686.

It may be that you need to install a specific glibc RPM package for the i686, as opposed to the generic.

I think the fact that this isn't mentioned in the documentation is a bug.

Try that for your problems.

Hi!

I have the same problem with large amounts of memory...

The legacy F77 code (not OpenMP) uses a large COMMON block as a storage for all arrays:
integer(8), parameter :: na = 280000000
real aa; common // aa(na)

The program works fine when the size of this block is na < 200000000. But when the size is increased, the program segfaults:
When linked with shared libs (/lib/i686/*), the executable segfaults immediately after start, and ldd says "not dynamic executable"???
When statically linked, the program starts, but segfaults while filling the arrays with data...

The behaviour is the same for fully optimized code (-O3) and for the debug version (-g) without any optimization.
The hard limits for memory (stack, data, max mem, max mem locked) are all set to "unlimited", and soft limits are set to 2000000 kbytes.

The system is SuSE Linux 9.0 Pro on a Pentium 4 HT with 3 GB of RAM.

So, still looking for any ideas... Thanks in advance.
WBR, Andrey

Hi
I had similar problems when I updated to 8.0. I use the workaround of explicitly allocating/deallocating large local arrays with ALLOCATE. This is not so nice but works for now...

Cheers
D
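
The ALLOCATE workaround described in the post above could look roughly like this when applied to the plouf test case from the first post. This is a minimal sketch, not the poster's actual code: the program name toto_alloc, the use of an internal procedure, and the stat= check are illustrative choices.

```fortran
program toto_alloc
  implicit none

  call plouf(5, 62, 64, 64)

contains

  subroutine plouf(nc, nx, ny, nz)
    integer, intent(in) :: nc, nx, ny, nz
    ! Allocatable instead of a local array sized from the arguments,
    ! so the storage is requested explicitly at run time.
    double precision, allocatable :: glou(:,:,:,:)
    integer :: ierr

    allocate(glou(nc, nx, ny, nz), stat=ierr)
    if (ierr /= 0) then
      print *, "allocation failed, stat =", ierr
      stop 1
    end if

    glou = 0.0d0
    print *, nc, nx, ny, nz
    print *, "yes"

    deallocate(glou)   ! release the storage explicitly before returning
  end subroutine plouf

end program toto_alloc
```

The trade-off is the one the poster mentions: every large local array needs its own allocate/deallocate pair, which is less convenient than a plain declaration.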

Hi!

In my case I've also found a simple workaround using dynamic allocation. I use the -dyncom option of the compiler. I've just replaced "common //" with "common /global/" in all source files, and compiled them with ifort -dyncom "global"...

It works, but seems to be slower than allocation of the COMMON block on the stack...

WBR, Andrey
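
The source change described in the post above amounts to giving the blank common block a name, so that it can be listed in the -dyncom option. A minimal sketch follows, with a deliberately small na so it runs anywhere; the program name dyncom_demo, the subroutine fill, and the file name in the comment are illustrative, and how the run-time library allocates the block is not shown here.

```fortran
! Compile as in the post above, e.g.: ifort -dyncom "global" demo.f90
! (demo.f90 is a hypothetical file name)
program dyncom_demo
  implicit none
  integer, parameter :: na = 1000
  real aa
  common /global/ aa(na)     ! named common instead of blank common //

  call fill()
  print *, aa(1), aa(na)
end program dyncom_demo

subroutine fill()
  implicit none
  integer, parameter :: na = 1000
  real aa
  common /global/ aa(na)     ! same name and layout in every program unit
  integer :: i

  do i = 1, na
    aa(i) = real(i)
  end do
end subroutine fill
```

The code is standard Fortran and also builds without -dyncom, in which case the block is allocated statically as usual.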

COMMON is not allocated on the stack - it is statically allocated. Yes, using dynamic common is slower because there is a run-time allocation process and you are accessing memory through pointers, which inhibits some optimization.

Steve
Note that, on most IA-32 Linux distributions, shared objects get loaded at 1 GB by default. If you try to load a static array of more than 800-900 MB, it may overwrite these, giving a seg fault, as you observe. This is an operating system issue or feature, not a compiler one. See the FAQ http://support.intel.com/support/performancetools/fortran/linux/sb/CS-007795.htm for suggested workarounds.

Martyn
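
The sizes from Andrey's post can be compared with the 1 GB figure mentioned above. This sketch only does the arithmetic, assuming a 4-byte default real and reading "1 GB" as 2**30 bytes; the names common_size and report are illustrative.

```fortran
program common_size
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)    ! 64-bit integers for byte counts
  integer, parameter :: bytes_per_real = 4            ! assumption: default real is 4 bytes
  integer(i8), parameter :: one_gb = 1073741824_i8    ! assumption: 1 GB = 2**30 bytes

  call report(200000000_i8)   ! size below which the program reportedly works
  call report(280000000_i8)   ! size used in the failing program

contains

  ! Print the byte size of a real array with na elements and whether it exceeds 1 GB.
  subroutine report(na)
    integer(i8), intent(in) :: na
    integer(i8) :: nbytes
    nbytes = na * bytes_per_real
    print '(a,i10,a,i11,a,l2)', 'na=', na, ' bytes=', nbytes, &
          ' exceeds 1 GB: ', nbytes > one_gb
  end subroutine report

end program common_size
```

With these assumptions, na = 200000000 corresponds to 800,000,000 bytes and na = 280000000 to 1,120,000,000 bytes, which is consistent with the 800-900 MB threshold described above.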

Hi,
I think I managed to localize the problem. Consider code:
!**************************************
PROGRAM Test
print *, "Call..."
CALL XNS(100)
END program Test
SUBROUTINE XNS(II)
real, dimension (1:II,1:II,1:II,3) :: rTestStack
rTestStack=1.
print *, "Done."
RETURN
END subroutine xns
!**************************************
This code needs only about 11 MB of stack frame (with dp=8bytes). That is below both the current soft and hard limits. It segfaults.
However, the following code:
!**************************************
PROGRAM Test
print *, "Call..."
CALL XNS(100)
END program Test
SUBROUTINE XNS(II)
real, dimension (1:100,1:100,1:100,3) :: rTestStack
rTestStack=1.
print *, "Done."
RETURN
END subroutine xns
!**************************************
works fine, although the stack size is the same. I suppose that is a bug. I submitted the issue to Premier Support and hope they will fix it.

My reply above addresses the example of Andrey.

EShapiro has provided an excellent test case, and that should certainly be addressed by Premier Support.

Martyn

Hi,
I have a similar problem with this compiler. When I run my research code with v7.1 it works fine, but when I port it to v8.0 it segfaults in a particular routine. It seems to allocate on the order of 1 GB for certain arrays in the argument list. Unfortunately, at the moment I have no solution but to stay with v7.1.
ash

I reported a similar problem to Intel support. The suggested solution was to add the -nothreads option when using -static. This worked for me (ifc 8.0.39_pe040).

Some more info: as I understand it, this is a libpthreads problem, since the threshold for the segfault is 2 MB and ldd shows that the executable is linked with libpthreads (although the -nothreads option is specified). The same code compiled with ifc 7.1 does not link to libpthreads. I reported it to technical support but have not received an answer yet.

If you experience the same problem, you can try compiling the code with "-nothreads" and then running "ldd 'executable'". If libpthreads is there, please contact technical support about it (they will probably react faster).

Using the same code as in the original post of this topic by "concombre", I still get the segfault for NY=NZ=64, even when applying the -nothreads option. For NY=NZ=16 it works fine.
It does not seem to be due to the libpthreads library, since, with the -nothreads flag, ldd does not show this library.

I have:
>ifort -nothreads -o toto toto.f90

>toto
segmentation fault

>ldd toto

libm.so.6 => /lib/i686/libm.so.6 (0x4002f000)
libcxa.so.5 => /opt/Intel/intel_fc_80/lib/libcxa.so.5 (0x40052000)
libunwind.so.5 => /opt/Intel/intel_fc_80/lib/libunwind.so.5 (0x40078000)
libc.so.6 => /lib/i686/libc.so.6 (0x4007e000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

I am running the free versions of ifc 7.1 and ifort 8 on Red Hat 7.2.
Has anyone received a solution yet?

According to my informers, a static link to libpthreads results in a fixed stack limit for each thread. If you have linked statically, libpthreads will not show up in ldd. If you link against libpthread.so, it will show up in ldd. Then, you should be able to use your shell command to control stack size limit. The 7.1 and 8.0 compilers currently differ in whether -static causes libpthreads to be linked statically. Future 8.x compilers may use a dynamic pthreads link, even though you request static, in order to avoid this fixed stack limit.

You may need one of the most recent compiler updates, either 7.1 or 8.0, to get a link which selects .so libraries correctly for certain recent linux distributions.
