problem with optimization > -O0

When porting (Fortran) codes to Linux/Intel and getting them running,
I had to set the optimization level to -O0, at least for some files.
Otherwise the codes crash. (Note that the codes in question
run without problems on UNIXes using different compilers and
high optimization levels.) Strangely, there is no difference
between the 7.1 and 8.0 Intel compilers in that respect.
Obviously, code compiled with -O0 runs slowly (which is most
noticeable on Itanium).

What exactly does -O1 do?
Can I manually enable some of the -O1 features, but not all,
to figure out what kind of optimization crashes the codes?

Unfortunately, all my attempts to improve the performance of the codes
compiled with -O0 by adding other switches made no significant difference.


I would suggest you try to identify what causes the programs to fail at -O2. Report the problem to Intel Support if it looks like a compiler bug.

Steve - Intel Developer Support
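One way to narrow the failure down, given that the build is known to work with everything at -O0, is to raise the optimization level of one source file at a time and see which file brings the crash back. The following is only a sketch: the `FC` and `HIGH_OPT` variables, the function names, and the file names in the usage line are assumptions for illustration, not part of any Intel tooling.

```shell
#!/bin/sh
# Sketch: compile one candidate file with optimization, all others at -O0.
# FC and HIGH_OPT are hypothetical knobs; adjust them to your setup.
FC=${FC:-ifort}
HIGH_OPT=${HIGH_OPT:--O1}

# Print the flag for one file: HIGH_OPT for the candidate, -O0 otherwise.
# Usage: opt_for <file> <candidate>
opt_for() {
    if [ "$1" = "$2" ]; then
        echo "$HIGH_OPT"
    else
        echo "-O0"
    fi
}

# Compile every file to an object, optimizing only the candidate.
# List the files in dependency order so module files exist when needed.
# Usage: build_with_candidate <candidate> <file1> <file2> ...
build_with_candidate() {
    candidate=$1
    shift
    for src in "$@"; do
        $FC $(opt_for "$src" "$candidate") -c "$src" || return 1
    done
}
```

For example, `build_with_candidate solver.f90 types.f90 solver.f90 main.f90` (hypothetical file names) compiles only solver.f90 with -O1. Link the objects, run the program, and repeat with the next candidate; any file whose optimized build brings the crash back is a good starting point for a small test case.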

If your program depends on implicit static data initialized to 0, the switches -zero and -save may help.

With ifort 8.0, -O1 is the same as -O2. With ifc 7.1, -O1 will not vectorize.

-zero is not yet implemented in 8.0.

Steve - Intel Developer Support

Many thanks for the messages!
Yes, -zero is unfortunately not implemented.
And I wonder if support for gprof is fully provided.

I wanted to find the hotspot in a poorly performing subroutine which I had to compile with -O0, but I could not get profiling to work on Itanium. ifort 8.0 can compile a simple test program with -p (or -qp), but the resulting code crashes. Can this be related to the libgcc or kernel version?

It also occurred to me that the problem with optimization could be related to vectorization. Is this the main difference between -O0 and -O1? Does it make sense to try -O1 and !DEC$ NOVECTOR before every loop?

You don't get vectorization unless you use -xN or -xW (or -xP).

Have you tried VTune?

Steve - Intel Developer Support

My problem is Itanium, not P4. So -xW etc are not applicable.

I have not tried VTune on Itanium yet. VTune requires very specific versions of libraries and warns that it may not work with others. I am not prepared to change the libs as yet.

Profiling with gprof does not work on the
Itanium. It messes up argument passing between
different modules (see attached test case).

I have submitted a bug report about this (264079),
but no answer so far.

For ifort on Itanium, optimization of at least -O1 -mp is generally required to get any reasonable performance. There is no point in profiling at -O0. The nearest thing to "vectorization" is software pipelining, which doesn't happen until -O2 (equivalent to the default) is invoked. -O1 already invokes risky optimization unless -mp is added. I think the -zero switch is available now, in case you have failed to initialize important data.
-O1 -mp is a reasonable option for minimum code size, high probability of working, and reasonable compile time. Unless you add -save -zero, it already depends on your source complying with the Fortran standard.
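The switch combinations mentioned above can be tried in order from safest to riskiest, which shows where the program first breaks. This is a rough sketch only: the `FC` variable, the function names, and the program and file names in the usage line are assumptions, and the compiler may reject -zero if the installed version does not implement it.

```shell
#!/bin/sh
# Sketch: build and run a program with increasingly aggressive flag sets.
# FC is a hypothetical knob; only flags named in this thread are used.
FC=${FC:-ifort}

# Build the sources with one flag set, run the result, report the outcome.
# Usage: try_flags "<flags>" <exe> <src>...
try_flags() {
    flags=$1
    exe=$2
    shift 2
    # $flags is left unquoted on purpose so it splits into separate options.
    if $FC $flags -o "$exe" "$@" && "./$exe" >/dev/null; then
        echo "ok:   $flags"
    else
        echo "FAIL: $flags"
    fi
}

# Walk from the safest setting to the riskiest one.
# Usage: flag_ladder <exe> <src>...
flag_ladder() {
    for flags in "-O0" "-O1 -mp -save -zero" "-O1 -mp" "-O1" "-O2"; do
        try_flags "$flags" "$@"
    done
}
```

A call such as `flag_ladder mycode main.f90` (hypothetical names) prints one line per flag set. The check only looks at the exit status, so it catches crashes and build errors but not a run that finishes with wrong numbers; compare the output against an -O0 reference run for that.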

I have examples of up to a few thousand source lines where gprof works with current ifort, as well as larger examples where it fails. It is known not to work at all with earlier versions of Intel Fortran (e.g. prior to March 2004), or with older versions of glibc. glibc-2.2.4-32.15 or newer are definitely OK. I've been told the correction came in around March 2003. If gprof doesn't work with gcc, you know you need some Linux upgrades. Bug reports with small examples where gprof fails with current Intel compiler releases are appreciated, as it is quite important to me that it be made reliable.

$ ifort -V
Intel Fortran Itanium Compiler for Itanium-based applications
Version 8.0 Build 20040716 Package ID: l_fc_pc_8.0.046_pl050.1
Copyright (C) 1985-2004 Intel Corporation. All rights reserved.

ifort: Command line error: no files specified; for help type "ifort -help"
$ cat test-single.f90
! Here's an even simpler test case.
! When I compile this with "ifort -p -o test-single test-single.f90", the
! output is 0.0000000E+00

module myupdate
contains
subroutine update(v)
real, intent(inout) :: v
v = v+5.
end subroutine update
end module myupdate

program main
use myupdate
real :: val
val = 3.4
call update(val)
print *,val
end program main

$ ifort -o test-single test-single.f90
$ ./test-single
$ ifort -p -o test-single test-single.f90
$ ./test-single
[zfkts@byzrzd Anneal-Test]$ rpm -q glibc
[zfkts@byzrzd Anneal-Test]$ uname -s -r -v -m -o
Linux 2.4.21-20.EL #1 SMP Wed Aug 18 20:30:22 EDT 2004 ia64 GNU/Linux
