Stack Overflow and FORALL

I am one of the developers of a simulation program that runs a variety of model inputs. One of those inputs triggers a stack overflow crash that occurs only in debug mode. Below is what is printed to the command prompt, and I have attached an image of what pops up when I run the debug build within VS. (I apologize for deleting the information pertaining to the specific files.)

######################################################

forrtl: severe (170): Program Exception - stack overflow
Image              PC                Routine      Line     Source
PROG_Debug         0000000141A3CB77
PROG_Debug         00000001405ED7D0
PROG_Debug         00000001400C4C4A
PROG_Debug         0000000141815D36
PROG_Debug         0000000141ABA386  Unknown      Unknown  Unknown
PROG_Debug         0000000141A3D1FC  Unknown      Unknown  Unknown
PROG_Debug         0000000141A3D33E  Unknown      Unknown  Unknown
kernel32.dll       00000000770959ED  Unknown      Unknown  Unknown
ntdll.dll          00000000771CC541  Unknown      Unknown  Unknown

######################################################

Going by the line numbers, the code crashes on the following FORALL statement:

         FORALL(I=1:NROW,J=1:NCOL,K=1:NLAY,ILOC(J,I,K).NE.0)
     +     BUFF(J,I,K)=BUFF(J,I,K)/( DR(J)*DC(I)*(BTM(J,I,LM(K)-1)-BTM(J,I,LM(K))) )

(Note that in this run NROW=150, NCOL=150, NLAY=6, so these are not big arrays, and the LM index pointer array is correct.)

The indexing is fine and all the variables are fine (they are used in other parts of the code). When I run the release build it does not crash, but the dump of BUFF to a file that should follow the FORALL appears not to happen, as if the FORALL acts like a RETURN statement.

When I change the FORALL to a DO CONCURRENT as follows:

         DO CONCURRENT (I=1:NROW,J=1:NCOL,K=1:NLAY,ILOC(J,I,K).NE.0)
          BUFF(J,I,K)=BUFF(J,I,K)/( DR(J)*DC(I)*(BTM(J,I,LM(K)-1)-BTM(J,I,LM(K))) )
         END DO

With this change the code behaves as expected in debug mode. Is this a bug, or am I missing something with regard to FORALL statements? Should I go through my code and replace every FORALL with DO CONCURRENT?

Thanks as always.

 

 


Correction: DO CONCURRENT crashes in release, but not in debug. I am assuming it's at the same location because the program runs when I comment out both the DO CONCURRENT and the FORALL. In release, the DO CONCURRENT gives the following error:

forrtl: error (65): floating invalid
Image              PC                Routine            Line        Source
PROG_x64.exe    000000013FCB045A  Unknown               Unknown  Unknown
PROG_x64.exe    000000013FB815AF  Unknown               Unknown  Unknown
PROG_x64.exe    0000000140049858  Unknown               Unknown  Unknown
PROG_x64.exe    00000001401547C6  Unknown               Unknown  Unknown
PROG_x64.exe    0000000140138548  Unknown               Unknown  Unknown
kernel32.dll       00000000770959ED  Unknown               Unknown  Unknown
ntdll.dll          00000000771CC541  Unknown               Unknown  Unknown

 

However, the following code runs fine in both debug and release:

         DO I=1,NROW
         DO J=1,NCOL
         DO K=1,NLAY
           IF(ILOC(J,I,K).NE.0) THEN
             BUFF(J,I,K)=BUFF(J,I,K)/( DR(J)*DC(I)*(BTM(J,I,LM(K)-1)-BTM(J,I,LM(K))) )
          END IF
         END DO
         END DO
         END DO

 

and produces the correct result to the output file.

 

Also, I realized that the error image I uploaded before did not upload correctly. It will work; you just have to add a jpg extension. I have reattached it here; this is the error I get with the FORALL in debug mode running through VS:

 

Steve Lionel (Intel):

Stack overflow is pretty simple. Either raise the stack reserve limit in the Linker properties (default is 1MB), or compile with /heap-arrays (Fortran > Optimization > Heap Arrays > 0)

Floating invalid is also pretty simple. Without seeing an actual test case, it's hard to speculate further.

Steve

The reason I think it's a bug is that the stack overflow does not occur in release; instead, the code does not even run. Also, when I changed the code to a series of DO loops with an IF statement, the stack is fine. So why would identical code give a stack overflow in a FORALL and a floating-point error in a DO CONCURRENT, yet run fine with nested DO loops and an IF statement?
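
One possible contributor to the difference, offered as a sketch rather than a diagnosis of this case: FORALL is not just another spelling of a DO loop. It requires every right-hand side to be evaluated before any element is assigned, so a compiler that cannot prove the two sides do not overlap may build a temporary copy, and such temporaries are often placed on the stack. Whether that is what happens with the BUFF statement above is an assumption; the small program below (program and variable names are made up for illustration) only demonstrates the semantic difference.

```fortran
      PROGRAM FORALL_SEMANTICS
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 5
      INTEGER :: A(N), B(N), I
      !
      A = (/ (I, I=1,N) /)
      B = A
      !
      ! FORALL: EVERY RIGHT-HAND SIDE IS EVALUATED FROM THE OLD A
      ! BEFORE ANY ELEMENT IS STORED, SO THE COMPILER MAY NEED A
      ! TEMPORARY COPY WHEN IT CANNOT PROVE THERE IS NO OVERLAP.
      FORALL (I=2:N) A(I) = A(I-1)
      !
      ! DO LOOP: EACH ITERATION SEES THE RESULT OF THE PREVIOUS
      ! ONE, SO NO TEMPORARY IS EVER REQUIRED.
      DO I = 2, N
        B(I) = B(I-1)
      END DO
      !
      PRINT '(A,5I3)', 'FORALL :', A   ! 1 1 2 3 4
      PRINT '(A,5I3)', 'DO LOOP:', B   ! 1 1 1 1 1
      END PROGRAM FORALL_SEMANTICS
```

In a debug build, where optimizations that would eliminate such a temporary are typically turned off, a full-size copy of a 3D array can exceed the default 1MB stack, which would be consistent with the /heap-arrays suggestion above.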

Steve Lionel (Intel):

Please provide a test case we can investigate.

Steve

Here is a simple program that triggers the error when I run it in debug mode in Visual Studio 2012. I was not able to recreate the floating-point error I got with DO CONCURRENT in release mode. The first part of the code is just setup, and the initial values in BUFF are just random numbers, so it's not doing a 0/1 case.

 

      PROGRAM LOOP_EVAL
      IMPLICIT NONE
      INTEGER::NROW,NCOL,NLAY
      INTEGER,DIMENSION(:,:,:),POINTER,CONTIGUOUS::ILOC
      DOUBLE PRECISION,DIMENSION(:,:,:),POINTER,CONTIGUOUS::BUFF,BTM
      DOUBLE PRECISION,DIMENSION(:),    POINTER,CONTIGUOUS::DR,DC,LM
      REAL,DIMENSION(150,150,10)::TEMP
      INTEGER:: I,J,K
      !
      !##########################################################
      !SET UP PARAMETERS
      NROW=150
      NCOL=150
      NLAY=10
      !
      ALLOCATE(BUFF(NROW,NCOL,NLAY),ILOC(NROW,NCOL,NLAY))
      ALLOCATE(DR(NCOL),DC(NROW),BTM(NROW,NCOL,0:NLAY),LM(NLAY))
      !
      CALL RANDOM_NUMBER(TEMP)
      WHERE(TEMP>0.3)       !JUST USED TO POPULATE ILOC.NE.0
        ILOC=1
      ELSEWHERE
        ILOC=0
      END WHERE
      !
      DR=200D0
      DC=100D0
      !
      BTM(:,:,0)=0D0
      DO I=1,NLAY
        LM(I)=I
        BTM(:,:,LM(I))=-100D0*DBLE(I)
      END DO
      !
      CALL RANDOM_NUMBER(TEMP)
      BUFF=DBLE(TEMP+0.1)*100
      !
      !##########################################################
      DO I=1,NROW                                               !WORKS FINE
      DO J=1,NCOL
      DO K=1,NLAY
          IF (ILOC(I,J,K).NE.0) THEN
             BUFF(I,J,K)=BUFF(I,J,K)/( DR(J)*DC(I)
     +                    *(BTM(I,J,LM(K)-1)-BTM(I,J,LM(K))) )
          ELSE
              BUFF(I,J,K)=0D0
          END IF
      END DO
      END DO
      END DO
      !
      !##########################################################
      BUFF=DBLE(TEMP+0.1)*100                                      !RESET BUFF FOR NEXT TEST
      !
      DO CONCURRENT (I=1:NROW,J=1:NCOL,K=1:NLAY,ILOC(I,J,K).NE.0)  !UNABLE TO CAUSE FLOATING POINT ERROR WITH RLS AS WAS SEEN IN ORIGINAL CODE
          BUFF(I,J,K)=BUFF(I,J,K)/( DR(J)*DC(I)
     +                    *(BTM(I,J,LM(K)-1)-BTM(I,J,LM(K))) )
      END DO
      DO CONCURRENT (I=1:NROW,J=1:NCOL,K=1:NLAY,ILOC(I,J,K).EQ.0)  !UNABLE TO CAUSE FLOATING POINT ERROR WITH RLS  
          BUFF(I,J,K)=0D0
      END DO
      !
      !##########################################################
      BUFF=DBLE(TEMP+0.1)*100                                      !RESET BUFF FOR NEXT TEST
      !
      WHERE(ILOC.EQ.0) BUFF=0D0                           !CAUSES ASSERTION FAILURE WHEN RUNNING DEBUG MODE IN VISUAL STUDIO
      FORALL(I=1:NROW,J=1:NCOL,K=1:NLAY,ILOC(I,J,K).NE.0) !IF WHERE IS COMMENTED THE FOLLOWING STILL CAUSES ASSERTION FAILURE
          BUFF(I,J,K)=BUFF(I,J,K)/( DR(J)*DC(I)
     +                    *(BTM(I,J,LM(K)-1)-BTM(I,J,LM(K))) )
      END FORALL
      !
      !##########################################################
      !
      END PROGRAM
 

 
