I encountered some strange behavior.
When a program contains
1. a DO loop whose increment parameter is a variable, and
2. a reference to that DO loop index after the loop,
the executable compiled in x64 RELEASE mode runs much slower than the one compiled in win32 RELEASE mode.
Here is a minimal sample program.
    PROGRAM x64_Release_runs_slow
        IMPLICIT NONE
        INTEGER :: i, j, k
        REAL :: t0, t1
        CALL CPU_TIME(t0)
        !
        k = 1
        DO i = 1, 10**8
            DO j = 1, 10**2, k ! 1. use variable k as an increment parameter
                !
            END DO
        END DO
        PRINT *, j ! 2. reference to the loop index j
        !
        CALL CPU_TIME(t1)
        PRINT *, t1 - t0
        STOP
    END PROGRAM x64_Release_runs_slow
This sample takes a few seconds in x64 RELEASE mode, but practically 0 seconds in win32 RELEASE mode.
The slowdown disappears when k is replaced with the constant 1 or when the line "PRINT *, j" is commented out.
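For comparison, here is a sketch of the first variant described above, with the variable increment k replaced by the constant 1. The program name is only illustrative, and the timing will of course depend on the compiler version and options. Presumably the compiler can fold the empty loops away when the step is a compile-time constant, but that is my assumption, not something I have confirmed.

```fortran
PROGRAM constant_step_variant   ! illustrative name, not part of the original sample
    IMPLICIT NONE
    INTEGER :: i, j
    REAL :: t0, t1
    CALL CPU_TIME(t0)
    DO i = 1, 10**8
        DO j = 1, 10**2, 1      ! constant increment instead of the variable k
        END DO
    END DO
    PRINT *, j                  ! the loop index is still referenced after the loop
    CALL CPU_TIME(t1)
    PRINT *, t1 - t0
END PROGRAM constant_step_variant
```

In both versions the value printed for j is 101, because the DO variable is incremented once more after the last iteration.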
In DEBUG mode, the x64 and win32 builds take almost the same CPU time.
I suppose this might be an optimization problem.
I attach a more realistic program in which I encountered this problem. (The option /assume:realloc_lhs is required.) In that case the x64 version is ~40% slower than the win32 version.