XE 2013 SP1 /O3 optimization bug

The following code snippet is from a DO loop over the variable I. When I compile with /O3, ILA2(I) gets overwritten with an impossible value of JJ, which seems to come from the preceding I-loop. That should not be possible: ISUM == 1 means the select case branch was taken, so JJ should have been updated as well.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                ISUM = 0
                DO J=ILA1(I),ILA2(I)
                        select case (MERKV(IAMA(J)))
                        case (MERKV_PREDEFINED_UNUSED,MERKV_UNDEFINED)
                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end select
                ENDDO
                IF (ISUM .EQ. 1) THEN
                    MERKG(I) = MERKG_USED
                    ILAUF=ILAUF+1
                    
                    select case (MERKV(IV))
                    case (MERKV_UNDEFINED)
                        MERKV(IV) = MERKV_DEFINED

                    case (MERKV_PREDEFINED_UNUSED)
                        MERKV(IV) = MERKV_PREDEFINED_USED
                    end select
                    
                    ILA2(I) = JJ                                    ! <-- overwrite happens here
                    NREIH = NREIH+1
                    IREIH(NREIH) = I
                    
                    IF (K .LE. 3) THEN
                        CALL SIMCONV_ADDVARDEF( I, IV )
                    ENDIF
                ENDIF

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

If I replace

                        select case (MERKV(IAMA(J)))
                        case (MERKV_PREDEFINED_UNUSED,MERKV_UNDEFINED)
                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end select

in the first DO loop with

                        if (MERKV(IAMA(J)) .le. MERKV_UNDEFINED) then

                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end if

which is semantically equivalent in the program logic, then /O3 does not produce erroneous code.
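
To illustrate when the two forms coincide, here is a minimal sketch. The numeric values of the four MERKV_* constants are hypothetical (the thread never shows them); the only assumption that matters is that MERKV_PREDEFINED_UNUSED and MERKV_UNDEFINED are the two smallest values MERKV can hold. The program name and the variable names state, by_select and by_if are likewise made up for illustration.

```fortran
program check_equivalence
    implicit none
    ! Hypothetical values: the real constants are not shown in the thread.
    ! Assumption: the two "unused/undefined" states are the two smallest.
    integer, parameter :: MERKV_PREDEFINED_UNUSED = 0
    integer, parameter :: MERKV_UNDEFINED         = 1
    integer, parameter :: MERKV_DEFINED           = 2
    integer, parameter :: MERKV_PREDEFINED_USED   = 3
    integer :: state
    logical :: by_select, by_if

    do state = MERKV_PREDEFINED_UNUSED, MERKV_PREDEFINED_USED
        ! Form 1: the select case test from the original loop
        by_select = .false.
        select case (state)
        case (MERKV_PREDEFINED_UNUSED, MERKV_UNDEFINED)
            by_select = .true.
        end select

        ! Form 2: the replacement IF test
        by_if = (state .le. MERKV_UNDEFINED)

        if (by_select .neqv. by_if) then
            print *, 'forms disagree for state ', state
            stop 1
        end if
    end do

    print *, 'both forms agree for all four states'
end program check_equivalence
```

If MERKV could hold a value below MERKV_PREDEFINED_UNUSED, the IF form would accept it while the select case form would not, so the equivalence holds only under the ordering assumed here.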

regards

Tobias


I just figured out that the optimization error already happens with /O2 /Qparallel /Qpar-threshold:90 /Qvec-threshold:90

Does adding just /Qvec- to disable automatic vectorization eliminate the problem? I just found that the vectorizer appears to be broken with a similar loop with an externally set index being incremented (http://software.intel.com/en-us/forums/topic/472668).

Hi Stuart,

You are right: adding /Qvec- eliminates the problem. So the auto-vectorizer is broken, which IMO breaks the whole optimizer, as you don't know what else is broken.
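
A narrower alternative to switching off vectorization for the whole build with /Qvec- is the Intel-specific !DIR$ NOVECTOR directive, which asks the compiler not to vectorize the single DO loop that follows it. The sketch below applies it to the inner loop of a small nested-loop example of the kind discussed in this thread. Which of the two loops is actually miscompiled is an assumption, and whether the directive avoids the problem is untested here; the program name novector_sketch is made up.

```fortran
program novector_sketch
    implicit none
    integer, parameter :: M1B = 10
    integer :: i, j, k, n
    integer :: m2(M1B)
    double precision :: r, dr, drod, f(M1B), rd(M1B*M1B)

    ! Set up the per-row trip counts and step factors
    do i = 1, M1B
        m2(i) = i
        f(i)  = i
    end do
    drod = 2.0d0
    rd   = 0.0d0

    n = 0
    r = 2.0d0
    do i = 1, M1B
        k  = m2(i)
        dr = drod*f(i)
        ! Assumption: the inner loop is the one being vectorized incorrectly.
        ! The directive is Intel-specific; other compilers see a plain comment.
        !DIR$ NOVECTOR
        do j = 1, k
            r = r + dr
            n = n + 1
            rd(n) = r
            r = r + dr
        end do
    end do

    ! Two sample elements, including the last one written
    print *, rd(37), rd(n)
end program novector_sketch
```

The directive only affects the loop it precedes, so the rest of the program keeps its normal optimization. It says nothing about /Qparallel, which may need to be checked separately.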

I really would like to see some comments from Intel's quality management on this issue.

regards

Tobias

OK, Tobias. Looks like we are in the same boat. I agree that this is serious.

Cheers,
Stuart

Steve Lionel (Intel)

Please provide a complete test case and we'll be glad to look at it. There's nothing we can do with an excerpt. Disabling the vectorizer significantly changes the code, and it might be hiding a coding error in the application.

Steve

@Steve: here is Stuart's code as a complete program, which gives wrong results when compiled with

/nologo /O3 /Qparallel /Qpar-threshold:1 /Qvec-threshold:1 /module:"Release\\" /object:"Release\\" /libs:static /threads /c

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    program Console4

    implicit none

    integer*4 i,n,j,k
    
    
    integer*4, parameter :: M1B = 10
    real*8 r,F(M1B),dr,drod,rd(M1B*M1B)
    integer*4 M2(M1B)
    ! Variables

         DO I = 1, M1B
             M2(I)=I
             F(I)=I
         enddo    
        DROD=2.0

        RD= 0

        N = 0
        R = 2.0
         DO I = 1, M1B
          K = M2(I)
          DR = DROD*F(I)
          DO J = 1, K
            R = R + DR
            N = N + 1
            RD(N) = R
            R = R + DR
          END DO
         END DO        

         DO I = 1, N
             print *, RD(I)
         enddo    
         
         N=I

    end program Console4
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Output of debug version (non-optimized):

   4.00000000000000
   10.0000000000000
   18.0000000000000
   28.0000000000000
   40.0000000000000
   52.0000000000000
   66.0000000000000
   82.0000000000000
   98.0000000000000
   114.000000000000
   132.000000000000
   152.000000000000
   172.000000000000
   192.000000000000
   212.000000000000
   234.000000000000
   258.000000000000
   282.000000000000
   306.000000000000
   330.000000000000
   354.000000000000
   380.000000000000
   408.000000000000
   436.000000000000
   464.000000000000
   492.000000000000
   520.000000000000
   548.000000000000
   578.000000000000
   610.000000000000
   642.000000000000
   674.000000000000
   706.000000000000
   738.000000000000
   770.000000000000
   802.000000000000
   836.000000000000
   872.000000000000
   908.000000000000
   944.000000000000
   980.000000000000
   1016.00000000000
   1052.00000000000
   1088.00000000000
   1124.00000000000
   1162.00000000000
   1202.00000000000
   1242.00000000000
   1282.00000000000
   1322.00000000000
   1362.00000000000
   1402.00000000000
   1442.00000000000
   1482.00000000000
   1522.00000000000

Output of release version (optimized):

   4.00000000000000
   10.0000000000000
   18.0000000000000
   28.0000000000000
   40.0000000000000
   52.0000000000000
   66.0000000000000
   82.0000000000000
   98.0000000000000
   114.000000000000
   132.000000000000
   152.000000000000
   172.000000000000
   192.000000000000
   212.000000000000
   234.000000000000
   258.000000000000
   282.000000000000
   306.000000000000
   330.000000000000
   354.000000000000
   380.000000000000
   408.000000000000
   436.000000000000
   464.000000000000
   492.000000000000
   520.000000000000
   548.000000000000
   578.000000000000
   610.000000000000
   642.000000000000
   674.000000000000
   706.000000000000
   738.000000000000
   770.000000000000
   802.000000000000
   708.000000000000             <<<<<<<---- the first wrong value
   744.000000000000
   780.000000000000
   816.000000000000
   852.000000000000
   888.000000000000
   924.000000000000
   960.000000000000
   852.000000000000
   890.000000000000
   930.000000000000
   970.000000000000
   1010.00000000000
   1050.00000000000
   1090.00000000000
   1130.00000000000
   1170.00000000000
   1210.00000000000
   1090.00000000000
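
Rather than comparing the two listings by eye, the program can check itself. The sketch below repeats the same accumulation and then recomputes every element from a closed form derived from the loop: row I contributes I elements, each row advances R by 4*I*I in total, and element J of row I equals base + 2*I + 4*I*(J-1), where base is the value of R on entry to the row. The names check_rd, base, expected and nbad are made up for this sketch, and the check assumes the second loop nest is itself compiled correctly.

```fortran
program check_rd
    implicit none
    integer, parameter :: M1B = 10
    integer :: i, j, k, n, nbad
    integer :: m2(M1B)
    double precision :: r, dr, drod, f(M1B), rd(M1B*M1B)
    double precision :: base, expected

    do i = 1, M1B
        m2(i) = i
        f(i)  = i
    end do
    drod = 2.0d0
    rd   = 0.0d0

    ! Accumulation under test (same structure as the reproducer)
    n = 0
    r = 2.0d0
    do i = 1, M1B
        k  = m2(i)
        dr = drod*f(i)
        do j = 1, k
            r = r + dr
            n = n + 1
            rd(n) = r
            r = r + dr
        end do
    end do

    ! Independent recomputation from the closed form.
    ! All values are small integers, so exact comparison is safe.
    n    = 0
    nbad = 0
    base = 2.0d0
    do i = 1, M1B
        do j = 1, i
            n = n + 1
            expected = base + 2.0d0*i + 4.0d0*i*(j - 1)
            if (rd(n) /= expected) then
                nbad = nbad + 1
                print *, 'mismatch at index', n, ': got', rd(n), 'expected', expected
            end if
        end do
        base = base + 4.0d0*i*i
    end do

    if (nbad == 0) then
        print *, 'all', n, 'values match'
    else
        print *, nbad, 'of', n, 'values are wrong'
        stop 1
    end if
end program check_rd
```

The closed form agrees with the debug listing above, for example 4 at index 1, 836 at index 37 and 1522 at index 55.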

@Stuart: I hope you don't mind that I took your example. (I have no clue what it does, but it does it wrong in the optimized version ;-)

No problem. I just posted an even smaller example for Steve under my original posting.

Steve Lionel (Intel)

Thanks - got it.

Steve
