How to detect a stack corruption?

Hi, All,

I have a very strange issue that I think is some sort of stack corruption.

First of all, the Linux builds (x86 and x64) and the Windows x64 build are all OK with the latest ifort v14 beta. (ifort v13.x actually gives a wrong result for the x64 build.) The problem never happens with a debug build on Windows, and it goes away if I enable some of the /check options.

Here is the code segment (ncon_beco, kconnod_beco, ncmp_beco, kconcmp_beco are all globals defined in a module):

      implicit real (a-h,o-z)

      do 300 ki=1,ncon_beco
 !      print *, 'ki = ', ki
        inode1 = kconnod_beco(ki,1)
        inode2 = kconnod_beco(ki,2)
        
C--- Load slave force/torque solution cluster component index for components on both sides of connection
        do 250 kk=1,ncmp_beco
          inode = kcmpnod_beco(kk)
          if(inode.eq.inode1) kconcmp_beco(ki,1) = kk
          if(inode.eq.inode2) kconcmp_beco(ki,2) = kk
  250   enddo   
        
        kk1 = kconcmp_beco(ki,1)
      print *, 'kk1 = ', kk1, 'ki = ', ki, 'kconcmp_beco(ki,1) = ', kconcmp_beco(ki,1)
  300 enddo

Observation 1) If I uncomment the statement "print *, 'ki = ', ki", the problem is gone.

Observation 2) If I use /check:bounds, the problem is gone.

Observation 3) Loop 250 is the real issue. It looks like it is not executed properly: even though kcmpnod_beco(:) is OK, the two if statements are bypassed, so kconcmp_beco is not assigned the right value.

Observation 4) The Win64 build (with the same compiler options) works OK (ifort v14), or gets past this crash (ifort v13.x) but runs into a wrong result later.

Does this indicate some kind of alignment issue?

Normally, when a print statement fixes a crash, I look at the local stack variables around the print location, but in this case I did not see much in the code.

Any ideas or suggestions?

I forgot to mention one important thing.

For the Win32 build, I use /arch:SSE2 to enable SSE2. If I use /arch:ia32 for that code, the problem is gone. That is why I mentioned the possibility of a data alignment issue. The following compiler options could be relevant:

/align:rec16byte /align:qcommons /align:sequence /real_size:64 /fpconstant /iface:cvf

For Win64, we do not use /iface:cvf but "/iface:cref /assume:underscore". The stdcall vs. cdecl calling convention could be relevant here.

I am afraid that there is too much analysis and too little presentation of evidence in the preceding. Please post a complete and self-contained example of code that displays the problem.
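
As an illustration of what a self-contained example could look like, here is a minimal sketch built around the posted loop. The module name beco_data, the array sizes and the node numbers are all invented for this sketch, and it is written in free-form source with fixed-size arrays, so it is only a starting point and may well not trigger the problem seen in the real code.

```fortran
! Hypothetical stand-in for the real module: name, sizes and data are invented.
module beco_data
  implicit none
  integer, parameter :: ncon_beco = 3, ncmp_beco = 4
  integer :: kconnod_beco(ncon_beco, 2)   ! node pair for each connection
  integer :: kcmpnod_beco(ncmp_beco)      ! node number of each component
  integer :: kconcmp_beco(ncon_beco, 2)   ! component index per connection side
end module beco_data

program repro
  use beco_data
  implicit none
  integer :: ki, kk, inode, inode1, inode2

  kcmpnod_beco = [101, 102, 103, 104]
  kconnod_beco(:, 1) = [101, 102, 103]
  kconnod_beco(:, 2) = [102, 103, 104]
  kconcmp_beco = 0            ! 0 marks "no matching component found"

  ! Same logic as loops 300 and 250 in the original post.
  do ki = 1, ncon_beco
    inode1 = kconnod_beco(ki, 1)
    inode2 = kconnod_beco(ki, 2)
    do kk = 1, ncmp_beco
      inode = kcmpnod_beco(kk)
      if (inode == inode1) kconcmp_beco(ki, 1) = kk
      if (inode == inode2) kconcmp_beco(ki, 2) = kk
    end do
  end do

  ! Check the results in a separate loop so that no print statement sits
  ! inside the loops under test.
  do ki = 1, ncon_beco
    if (kconcmp_beco(ki, 1) /= ki .or. kconcmp_beco(ki, 2) /= ki + 1) then
      print *, 'mismatch at ki = ', ki, kconcmp_beco(ki, :)
      error stop 1
    end if
  end do
  print *, 'all connections resolved'
end program repro
```

Building this with the same options as the failing Win32 build, and then growing it toward the real code until the mismatch appears, is one way to narrow the problem down.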

In a Release build, you should be able to compile with the debug database and place a breakpoint on the print statement just before label 300.
At the breakpoint, you should be able to look at the disassembly window to ascertain what is going on. You can do the same with the first print uncommented. Compare the two outputs, stripping out the code that generates the call to print and the arguments pushed on the stack.

I do not think this is an alignment issue. It is more likely an optimization issue.

Submitting a short reproducer will aid in analysing the issue.

Jim Dempsey

www.quickthreadprogramming.com

Jim,

Thanks for the tips. I tried going down to the assembly level, but I have to admit I am really not good at it, and diffing the two outputs did not get me anywhere. It does seem, though, that having "print *, 'ki = ', ki" inside loop 300 changes a major block of the assembly (not just a simple addition). Beyond that, I do not know what to look for.

There is already a case I reported several months ago against Intel v13.x, though at that time I only reported a crash with the x86/SSE2 build. Intel was able to reproduce the crash, but nothing more came of it. Right now, I am moving on to see whether ifort v14.x fares better.

In-house, I have asked my developers many times to look into this crash, but they think the code is fine.

There is a /check:stack option as of Composer XE 2013. You may also want to try running under the memory checker of Inspector XE - you can download a 30 day free trial if you like.

Steve - Intel Developer Support

Hi, Steve,

I forgot to mention this /check:stack thing yesterday: adding /check:stack on that single file makes the crash go away, just like the print statement "print *, 'ki = ', ki". (Of course, I know a print statement normally pushes a lot of extra stuff onto the stack.)

I have Parallel Studio XE 2013, and the model has already been run through Inspector XE (debug build only); nothing was found. But of course, the debug build always runs.

In the original issue I filed with Intel, I also noted that disabling vectorization on a small loop makes the crash go away. I am not sure whether I can reproduce that with the latest ifort, though.
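
For reference, one way to switch off vectorization for a single loop with Intel Fortran is the NOVECTOR directive placed immediately before the DO statement. The sketch below applies it to a small search loop similar to loop 250; the program and variable names are made up, and this is not necessarily the loop from the original issue.

```fortran
program novector_demo
  implicit none
  integer, parameter :: n = 4
  integer :: nodes(n), kk, found, target_node

  nodes = [101, 102, 103, 104]   ! made-up node numbers
  target_node = 103
  found = 0

  ! Intel-specific directive; other compilers treat it as a plain comment.
  !DIR$ NOVECTOR
  do kk = 1, n
    if (nodes(kk) == target_node) found = kk
  end do

  if (found /= 3) error stop 1
  print *, 'found = ', found
end program novector_demo
```

If the failure disappears with the directive on one loop and comes back without it, that points at the code generated for that loop rather than at the stack.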

Putting all these together, I used 'stack corruption' for this thread, and in the Intel issue I actually used 'vectorization caused'. Of course, this is just my speculation based on my observations.

Use "implicit none", then verify that what you have shown us is actually what you have, i.e. make sure you do not have a typo in your code.
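
A small sketch of why this helps, using invented names and data: with "implicit none" every name must be declared, so a misspelled variable is rejected at compile time instead of silently becoming a new variable. The posted code uses similar-looking names such as kconnod_beco and kcmpnod_beco, which is the kind of thing this check would confirm.

```fortran
program implicit_none_demo
  implicit none              ! every name must now be declared explicitly
  integer, parameter :: ncmp = 3
  integer :: kcmpnod(ncmp)   ! component node numbers (made-up data)
  integer :: kk, inode, nhits

  kcmpnod = [11, 22, 33]
  nhits = 0
  do kk = 1, ncmp
    inode = kcmpnod(kk)
    ! A typo such as "inod = kcmpnod(kk)" would be a compile-time error here.
    ! Under "implicit real (a-h,o-z)" it would silently create a new integer
    ! variable inod and leave inode unchanged.
    if (inode == 22) nhits = nhits + 1
  end do

  if (nhits /= 1) error stop 1
  print *, 'nhits = ', nhits
end program implicit_none_demo
```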

Jim Dempsey

www.quickthreadprogramming.com
