How to detect a stack corruption?

Posted by a.zhaogtisoft.com

Hi, All,

I have a very strange issue; I think it is a stack corruption of some sort.

First of all, the Linux builds (x86 and x64) and the Windows x64 build are all OK with the latest ifort v14 beta. (ifort v13.x actually gives a wrong result for the x64 build.) The problem never happens with a debug build on Windows, and it goes away if I enable some of the /check options.

Here is the code segment (ncon_beco, kconnod_beco, ncmp_beco, kconcmp_beco are all globals defined in a module):

      implicit real (a-h,o-z)

      do 300 ki=1,ncon_beco
 !      print *, 'ki = ', ki
        inode1 = kconnod_beco(ki,1)
        inode2 = kconnod_beco(ki,2)
        
C--- Load slave force/torque solution cluster component index for components on both sides of connection
        do 250 kk=1,ncmp_beco
          inode = kcmpnod_beco(kk)
          if(inode.eq.inode1) kconcmp_beco(ki,1) = kk
          if(inode.eq.inode2) kconcmp_beco(ki,2) = kk
  250   enddo   
        
        kk1 = kconcmp_beco(ki,1)
      print *, 'kk1 = ', kk1, 'ki = ', ki, 'kconcmp_beco(ki,1) = ', kconcmp_beco(ki,1)
  300 enddo

Observation 1) If I add the statement "print *, 'ki = ', ki", the problem is gone.

Observation 2) If I use /check:bounds, the problem is gone.

Observation 3) Loop 250 is the real issue. It looks like it is not executed properly: even though kcmpnod_beco(:) is OK, the two if statements are bypassed, so kconcmp_beco is not assigned the right value.

Observation 4) The Win64 build (with the same compiler options) either works OK (ifort v14) or gets past this crash (ifort v13.x), though it runs into a wrong result later.

Does this indicate some kind of alignment issue?

Normally, when a print statement can fix a crash, I look at the local stack variables around the location of the print, but in the above case I did not see much in the code.

Any ideas or suggestions?

Posted by a.zhaogtisoft.com

I forgot to mention one important thing.

For the Win32 build, I use /arch:SSE2 to support SSE2. If I use /arch:ia32 for that code, the problem is gone. That is the reason I mentioned the possibility of a data alignment issue. The following compiler options are the ones I think could be relevant:

/align:rec16byte /align:qcommons /align:sequence /real_size:64 /fpconstant /iface:cvf

For Win64, we do not use /iface:cvf but "/iface:cref /assume:underscore". The stdcall vs. cdecl difference could be relevant here.

Posted by mecej4

I am afraid that there is too much analysis and too little presentation of evidence in the preceding posts. Please post a complete and self-contained example of code that displays the problem.
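As an illustration of what such a self-contained example might look like, here is a minimal free-form sketch built around the loop from the first post. The array names follow that post, but the module name `beco_data`, the array sizes, and all of the data values are assumptions made up for this sketch; it is not the poster's actual code and is not known to reproduce the failure.

```fortran
! Hypothetical stand-in for the real module. Array names follow the
! original post; the module name, sizes and data are invented.
module beco_data
  integer, parameter :: ncon_beco = 2, ncmp_beco = 3
  integer :: kconnod_beco(ncon_beco,2)   ! nodes on both sides of a connection
  integer :: kcmpnod_beco(ncmp_beco)     ! node of each component
  integer :: kconcmp_beco(ncon_beco,2)   ! component index per connection side
end module beco_data

program repro
  use beco_data
  implicit real (a-h,o-z)   ! same implicit typing as the original routine

  ! Made-up test data: component kk sits on node kcmpnod_beco(kk)
  kcmpnod_beco = (/ 10, 20, 30 /)
  ! Connection 1 joins nodes 10 and 30; connection 2 joins nodes 20 and 10
  kconnod_beco(1,:) = (/ 10, 30 /)
  kconnod_beco(2,:) = (/ 20, 10 /)
  kconcmp_beco = 0

  do ki = 1, ncon_beco
    inode1 = kconnod_beco(ki,1)
    inode2 = kconnod_beco(ki,2)
    ! Same search as loop 250 in the original code
    do kk = 1, ncmp_beco
      inode = kcmpnod_beco(kk)
      if (inode .eq. inode1) kconcmp_beco(ki,1) = kk
      if (inode .eq. inode2) kconcmp_beco(ki,2) = kk
    end do
    print *, 'ki = ', ki, ' kconcmp_beco(ki,:) = ', kconcmp_beco(ki,:)
  end do
end program repro
```

With this invented data, the component indices should come out as 1 and 3 for connection 1, and 2 and 1 for connection 2. A reproducer is only useful if it is built with the same options as the failing Win32 release build and actually prints something different from those values.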

Posted by jimdempseyatthecove

In a Release build, you should be able to compile with debug information and place a breakpoint on the print in front of 300.
At the break, you should be able to look at the disassembly window to ascertain what is going on. You can do the same with the first print uncommented. Compare the two outputs, stripping out the code that generates the call to print and the args pushed on the stack.

I do not think this is an alignment issue. It is more likely an optimization issue.

Submitting a short reproducer will aid in analysing the issue.

Jim Dempsey

www.quickthreadprogramming.com

Posted by a.zhaogtisoft.com

Jim,

Thanks for the tips. I tried to go down to the assembly level, but I have to admit I am really not good at it; diffing the two outputs did not get me anywhere. It does seem, though, that adding "print *, 'ki = ', ki" right inside loop 300 changes a major block of the assembly (not just a simple addition). Beyond that, I do not know what to look for.

There is already a case I reported several months ago for Intel v13.x, though at that time I only reported a crash with the x86/SSE2 build. Intel was able to reproduce the crash, but nothing more came of that. Right now, I am moving on to see if ifort v14.x fares better.

In house, I have bugged my developers many times to look into this crash, but they think the code is fine.

Posted by Steve Lionel (Intel)

There is a /check:stack option as of Composer XE 2013. You may also want to try running under the memory checker of Inspector XE; you can download a 30-day free trial if you like.

Steve

Posted by a.zhaogtisoft.com

Hi, Steve,

I forgot to mention this /check:stack option yesterday: adding /check:stack on that single file makes the crash go away, just like the print statement "print *, 'ki = ', ki" does. (Of course, I know a print statement normally pushes a lot of extra stuff onto the stack.)

I have Parallel Studio XE 2013, and the model has already been run through Inspector XE (debug build only); nothing was found in the Inspector run. But of course, the debug build always runs fine.

In the original issue I filed with Intel, I also noted that disabling vectorization on a small loop makes the crash go away. I am not sure if I can reproduce that with the latest ifort, though.

Putting all this together, I used 'stack corruption' for this thread, and in the Intel issue I actually used 'vectorization caused'. Of course, it is just my speculation based on my observations.

Posted by jimdempseyatthecove

Use "implicit none", then verify that what you have shown us is actually what you have, i.e., make sure you do not have a typo in your code.

Jim Dempsey

www.quickthreadprogramming.com
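
A rough sketch of how that check could look is given below. It declares every variable under `implicit none` and also presets the result array to 0, so that "the if statements were bypassed" can be told apart from "no component sits on that node". The shortened names (`kconnod`, `kcmpnod`, `kconcmp`), the sizes, and the data are all hypothetical and only loosely modeled on the original loop.

```fortran
program check_mapping
  implicit none   ! any misspelled or undeclared name is now a compile error
  ! All sizes and data below are made up for illustration.
  integer, parameter :: ncon = 2, ncmp = 3
  integer :: kconnod(ncon,2), kcmpnod(ncmp), kconcmp(ncon,2)
  integer :: ki, kk, inode

  kcmpnod = (/ 10, 20, 30 /)
  kconnod(1,:) = (/ 10, 30 /)
  kconnod(2,:) = (/ 20, 99 /)   ! node 99 has no component, on purpose

  kconcmp = 0                   ! 0 marks "no component found yet"
  do ki = 1, ncon
    do kk = 1, ncmp
      inode = kcmpnod(kk)
      if (inode == kconnod(ki,1)) kconcmp(ki,1) = kk
      if (inode == kconnod(ki,2)) kconcmp(ki,2) = kk
    end do
    ! Report any side of the connection that was never assigned
    if (any(kconcmp(ki,:) == 0)) then
      print *, 'connection ', ki, ' has an unmatched node: ', kconnod(ki,:)
    end if
  end do
end program check_mapping
```

In this made-up data set only connection 2 is reported, because node 99 matches no component. If the real code reports unmatched nodes for data that is known to be consistent, that points away from a typo and toward the optimizer or memory being overwritten elsewhere.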