Unrolling problem in Intel Fortran v14.0.2.144 (x86_64/EM64T)

Hi,

 

I would like to report a problem with the unrolling of a very simple loop:

 

      DO 200 I = 1, NDIM
         VEC(I) = A(I*(I+1)/2)
  200 CONTINUE

 

Something goes wrong when the 'A' and 'VEC' arrays/pointers refer to the same memory location and the memory address itself is not aligned on a 16-byte boundary. When this piece of code is compiled with '-O2', we always get wrong results for the first [2:k] elements of VEC, where k depends on the value of 'NDIM'. However, we always get correct results if unrolling is completely disabled, either by passing the '-unroll=0' option to the compiler or by using the DEC directive:

 

cDEC$ NOUNROLL
      DO 200 I = 1, NDIM
         VEC(I) = A(I*(I+1)/2)
  200 CONTINUE

 

To illustrate the problem, here is the output from our real test:

 

#1
COPDIA NDIM: 67
ADRESSES :38897040 38897040
ADDRESSES ARE NOT ALIGNED ON 8-BYTE BOUNDARY
2.258586282289821E-004  2.258586282289821E-004  T
1.129034497633417E-003  1.790368778501193E-003  F
1.790368778501193E-003  2.945284711568669E-003  F
1.873467396476006E-003  2.331244172747492E-002  F
2.618478092336402E-003  4.152520189742853E-002  F
2.945284711568669E-003  6.745867362432861E-002  F
1.335999185554419E-002  0.114438101565396       F
1.345512252979789E-002  0.294939942339324       F
1.676828907801860E-002  0.644754785440931       F
2.331244172747492E-002  1.15038004433901        F
3.037574582317423E-002  1.84400534002079        F
3.208721233146836E-002  3.208721233146836E-002  T
....

#2
COPDIA NDIM: 27
ADRESSES :37716080 37716080
ADDRESSES ARE NOT ALIGNED ON 8-BYTE BOUNDARY
3.769405624512559E-002  3.769405624512559E-002  T
8.894699416983025E-002  9.866743377462182E-002  F
9.866743377462182E-002  0.292273738386099       F
0.231468604273682       0.384206422411758       F
0.247144738198662       0.566242246806149       F
0.292273738386099       1.42326482050238        F
0.303953189347391       0.303953189347391       T
....

 

Here we compare the 'VEC' arrays produced by the optimized (unrolled) and modified (non-unrolled) versions of the loop; the final T/F column marks whether the two values agree. ADRESSES gives the memory locations of the 'A' and 'VEC' arrays.
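One observation that may help explain the dependence of k on NDIM: when 'VEC' and 'A' start at the same address, iteration I reads element I*(I+1)/2 and writes element I, so reads and writes can only touch the same elements while I*(I+1)/2 <= NDIM. The small program below is a hypothetical helper (it is not part of the attached test code, and the program name is made up) that computes the largest such I for the two NDIM values shown above. It is a sketch of the index arithmetic only, not a confirmed diagnosis of what the compiler does.

```fortran
      PROGRAM OVRLAP
      ! Hypothetical helper, not part of the original test code.
      ! For each NDIM, find the largest I whose read index
      ! I*(I+1)/2 still lies in 1..NDIM, i.e. inside the range
      ! the loop also writes when VEC and A share an address.
      INTEGER NDIMS(2), NDIM, I, K, N
      DATA NDIMS /27, 67/
      DO 20 N = 1, 2
         NDIM = NDIMS(N)
         K = 0
         DO 10 I = 1, NDIM
            IF (I*(I+1)/2 .LE. NDIM) K = I
   10    CONTINUE
         WRITE (*,*) 'NDIM =', NDIM, '  last overlapping I =', K
   20 CONTINUE
      END
```

For NDIM = 27 this gives 6 (6*7/2 = 21, 7*8/2 = 28) and for NDIM = 67 it gives 11 (11*12/2 = 66, 12*13/2 = 78), which coincides with the last mismatching row in listings #2 and #1 respectively.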

 

Enclosed please find a code snippet used for our testing purposes, as well as the output from a real test. Since our project is quite a big one, we cannot provide the whole source code.

 

With best regards,

Victor.

Attachment: Unrolling.tar (30 KB)

As your source code violates the Fortran standard, you must set -assume dummy_aliases if you hope to have this work in any reasonable way.  As your attached source code doesn't compile, and you didn't mention this point, I can't verify whether that makes the difference.

Hi Tim,

Thank you very much for your expertise and the suggested hint! Indeed, passing the '-assume dummy_aliases' option to the compiler solves the problem. Allocating and using a temporary array also works. I will just mention that Intel v13, as well as other Fortran compilers, produces correct results regardless of argument aliasing.

With best regards,

Victor.
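The temporary-array workaround mentioned above might look like the following minimal sketch. The thread does not show the real routine, so the argument list of COPDIA, the driver program TMPFIX, and the test data are all assumptions made for illustration. The idea is to pass two distinct arrays to the routine and copy the result back afterwards, so that the dummy arguments never alias.

```fortran
      PROGRAM TMPFIX
      ! Illustrative only: COPDIA's argument list is an assumption.
      INTEGER NDIM, NPACK, I
      PARAMETER (NDIM = 27, NPACK = NDIM*(NDIM+1)/2)
      DOUBLE PRECISION A(NPACK)
      DOUBLE PRECISION, ALLOCATABLE :: TMP(:)
      ! Made-up test data: A(K) = K.
      DO 10 I = 1, NPACK
         A(I) = DBLE(I)
   10 CONTINUE
      ALLOCATE (TMP(NDIM))
      ! Distinct actual arguments, so no aliasing inside COPDIA.
      CALL COPDIA(A, TMP, NDIM)
      ! Copy the extracted elements back over the start of A.
      DO 20 I = 1, NDIM
         A(I) = TMP(I)
   20 CONTINUE
      DEALLOCATE (TMP)
      WRITE (*,*) (A(I), I = 1, 4)
      END

      SUBROUTINE COPDIA(A, VEC, NDIM)
      ! Same loop as in the report, with assumed declarations.
      INTEGER NDIM, I
      DOUBLE PRECISION A(*), VEC(*)
      DO 200 I = 1, NDIM
         VEC(I) = A(I*(I+1)/2)
  200 CONTINUE
      END
```

With the made-up data A(K) = K, the first four elements of A end up as 1, 3, 6 and 10. The extra copy costs NDIM assignments, but the call no longer depends on '-assume dummy_aliases' or on how the compiler schedules the loop.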
