Hey, I'm seeing slower performance for a small toy program (attached) when it is compiled with ifort 14.0.3 at -O3/-Ofast than at -O2, and it is also slower than gfortran at -O2/-O3.
When using gfortran 4.9.0:
gfortran -O2 -o fannkuch_gcc fannkuch.f90 && time ./fannkuch_gcc 11
takes 3.15s, and with -O3 it takes 2.75s.
When using ifort 14.0.3:
ifort -O2 -o fannkuch_intel fannkuch.f90 && time ./fannkuch_intel 11
it takes 3.03s, but with -O3 or -Ofast the run time goes up to 4.8s. When I replace the array copy with an explicit loop in the source code, performance improves, but it is still worse than ifort -O2 and nowhere near gfortran's -O3. I didn't spot any obvious differences in the output of -vec-report or -opt-report.
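To make the array copy change concrete, here is a minimal sketch of the two variants I mean. This is not the code from the attached file: the names perm1, perm_a, perm_b and the size n = 11 are made up for illustration, and the actual copy in fannkuch.f90 may use array sections or different bounds.

```fortran
program copy_variants
  implicit none
  ! Hypothetical size and array names, chosen only for this illustration.
  integer, parameter :: n = 11
  integer :: perm1(n), perm_a(n), perm_b(n)
  integer :: i

  ! Fill the source array with 1..n.
  perm1 = [(i, i = 1, n)]

  ! Variant 1: whole-array assignment (the "array copy").
  perm_a = perm1

  ! Variant 2: the same copy written as an explicit loop.
  do i = 1, n
     perm_b(i) = perm1(i)
  end do

  ! Both variants must produce identical results.
  if (any(perm_a /= perm_b)) stop 1
  print *, "copies match"
end program copy_variants
```

Both forms do the same work, so any timing difference between them should come from how the compiler lowers the copy at a given optimization level.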
I realize this is just a tiny program, so maybe some over-optimization problems are to be expected at higher optimization levels?