I believe I have found a bug in the Intel Fortran compiler 11.0 (Linux).
Our lab is a licensed user of the product, and I previously opened an issue through
the Intel Premier account interface but received no immediate help.
I have example code for the problem, which is given at the very end of this message.
The original bug essentially comes down to the following:
if trigonometric functions are called inside a loop over complex data structures
(for which the vectorizer complains "subscript too complex"), they appear to break
dependency recognition and force vectorization, which produces an incorrect result.
(Unacceptable) fixes include:
Compiler flags:
-O0
-O1
-no-vec
Code changes:
- Explicitly introduce dependence in the loop through write-statements
- Simplify data structures
- Use other intrinsic functions (like sqrt) instead of trigonometric ones (although I
have since hit a similar bug that does not involve any intrinsic at all)
Architecture changes:
- Switch to a 32-bit build
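
As a minimal sketch of the write-statement workaround from the list above (not our production code): the loop below uses the same indirect store-and-accumulate pattern as the reproducer, with an I/O statement added to the loop body. The names demo_workaround, store and scratch are made up for illustration, the index maps are fixed rather than random, and the internal write to a character buffer is my assumption about what counts as a "write-statement" here. An I/O call in the loop body is expected to keep the vectorizer away from the loop, but I have not verified this form against every 11.0 build.

```fortran
module store
  implicit none
  type t_dummy
    real(8), allocatable :: it(:,:)
  end type t_dummy
  type(t_dummy), allocatable :: dummy(:)
  integer, allocatable :: imap(:), bmap(:)
end module store

program demo_workaround
  use store
  implicit none
  integer :: i, j
  real(8) :: rvec(3), val
  character(len=32) :: scratch   ! hypothetical buffer for the internal write

  allocate(dummy(1), imap(1), bmap(1))
  allocate(dummy(1)%it(2,100))
  dummy(1)%it = 0.0d0
  imap(1) = 5                    ! fixed maps instead of random ones (assumption)
  bmap(1) = 1
  rvec = (/ 0.25d0, 0.5d0, 0.75d0 /)
  val = 0.0d0
  i = 1

  do j = 1, 3
    dummy(bmap(i))%it(1,3*imap(i)+j) = rvec(j)*sin(rvec(1))
    ! Workaround: an I/O statement inside the loop body
    write(scratch,'(I0)') j
    val = val + dummy(bmap(i))%it(1,3*imap(i)+j)
  end do

  ! val is 1.5*sin(0.25) here, so it must come out positive
  write(*,*) 'val =', val
end program demo_workaround
```

The obvious drawback, and the reason this is not acceptable for us, is that the I/O call sits in the hot loop.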
For reference, example code is given below. When compiled with -O3 as the only
option, it yields a value of zero for the variable "val", which is clearly incorrect.
Any one (or more) of the fixes listed above restores the correct result. This is a
major bug for scientific code such as the code we work on and hence very important to us.
The previous compiler we used (version 10.1) handles this correctly.
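
To make "clearly incorrect" checkable at run time, here is a small sketch that accumulates the same terms twice: once through the indirectly addressed derived-type array (as in the reproducer) and once into a plain scalar reference. The program name check_val, the variables ref and term, and the fixed index maps are my own additions for illustration. Note that introducing the scalar term changes the loop body, so this variant may or may not still trigger the miscompilation; it is meant as a cross-check pattern, not as a second reproducer.

```fortran
program check_val
  implicit none
  type t_dummy
    real(8), allocatable :: it(:,:)
  end type t_dummy
  type(t_dummy), allocatable :: dummy(:)
  integer, allocatable :: imap(:), bmap(:)
  integer :: i, j
  real(8) :: rvec(3), arg3, val, ref, term

  allocate(dummy(2), imap(9), bmap(9))
  do i = 1, 2
    allocate(dummy(i)%it(2,100))
    dummy(i)%it = 0.0d0
  end do
  do i = 1, 9
    imap(i) = i                ! fixed maps instead of random ones (assumption)
    bmap(i) = 1 + mod(i, 2)
  end do

  val = 0.0d0
  ref = 0.0d0
  do i = 1, 9
    call random_number(arg3)
    call random_number(rvec)
    do j = 1, 3
      term = arg3*rvec(j)*sin(rvec(1))
      ref = ref + term                              ! plain scalar reference
      dummy(bmap(i))%it(1,3*imap(i)+j) = term       ! indirect store
      val = val + dummy(bmap(i))%it(1,3*imap(i)+j)  ! indirect read-back
    end do
  end do

  write(*,*) 'val =', val, ' ref =', ref
  if (abs(val - ref) > 1.0d-12*max(abs(ref), 1.0d0)) then
    write(*,*) 'MISMATCH: indirect accumulation disagrees with reference'
  else
    write(*,*) 'ok'
  end if
end program check_val
```

Every term is a product of values in [0,1) and the sine of such a value, so both sums are non-negative, and a result of exactly zero for val alongside a positive ref would point at the store/read-back dependence being lost.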
Please note that since this first bug I have encountered another, similar issue with
complicated data structures and simple loops in which the data dependency is
not recognized properly. The results are simply wrong, with no warning or
diagnostic, even when using -vec-report3.
Has this been fixed in 11.0.081, or is it at least being worked on?
Right now, version 11 is a strict no-go for us.
Andreas
---------------------------------------------------------
module one
  type t_dummy
    real(8), ALLOCATABLE:: it(:,:)
    real(8), ALLOCATABLE:: f(:)
  end type t_dummy
  type(t_dummy), ALLOCATABLE:: dummy(:)
  integer, ALLOCATABLE:: imap(:)
  integer, ALLOCATABLE:: bmap(:)
end module one

program abug
  use one
  implicit none
  integer ndum,da,db,i
  real(8), ALLOCATABLE:: prns(:)
  real(8) val

  ndum = 10
  da = 2
  db = 100

  allocate(dummy(ndum))
  do i=1,ndum
    allocate(dummy(i)%it(da,db))
  end do

  ! bmap: random indices into dummy(:)
  allocate(bmap(ndum))
  allocate(prns(ndum))
  call random_number(prns)
  bmap(:) = int(10.0*prns - 0.5) + 1
  deallocate(prns)

  ! imap: random offsets into the second dimension of it(:,:)
  allocate(imap(db))
  allocate(prns(db))
  call random_number(prns)
  imap(:) = int(30.0*prns - 0.5) + 1
  deallocate(prns)

  call s2(val)
  write(*,*) val

  deallocate(imap)
  deallocate(bmap)
  do i=1,ndum
    deallocate(dummy(i)%it)
  end do
  deallocate(dummy)
end

subroutine s1(i,ttc,arg3,val)
  use one
  implicit none
  integer i,j,ttc
  real(8), INTENT(in):: arg3
  real(8), INTENT(inout):: val
  real(8) rvec(3)

  call random_number(rvec)
  ! This is the loop that gets miscompiled: store, then read back the same element
  do j=1,3
    dummy(bmap(i))%it(ttc,3*imap(i)+j) = arg3*rvec(j)*sin(rvec(1))
    val = val + dummy(bmap(i))%it(ttc,3*imap(i)+j)
  end do
end

subroutine s2(val)
  use one
  implicit none
  integer i,ttc,j,imol
  real(8) arg3
  real(8), INTENT(inout):: val

  ttc = 1
  imol = 1
  dummy(imol)%it(ttc,:) = 0.0
  val = 0.0
  do i=1,9
    call random_number(arg3)
    call s1(i,ttc,arg3,val)
  end do
end