ifort giving wacky results vs other compilers

I just got a new MacBook Pro (64-bit, 2 cores) and thought that I would try to take advantage of Intel's compiler (v 10.0.016). I have a modest-size fluid dynamics code that I am trying to get to run with ifort. The code has been running on the Intel 64 architecture using gfortran, and on the 32-bit PowerPC architecture using both the gfortran and Absoft compilers. The code has run on different machines and we are pretty damn sure that it is doing what it should. We can test it against other codes and we can compare it to the results of simple pencil-and-paper calculations.

However, ifort gives completely different results than every other machine/compiler combo that I have tried. This is true even when I compile it with -O0. The code does not produce any compile errors or warnings nor any runtime errors. It just turns out that the results are wrong.

But not all the results. It doesn't seem to be a precision issue. All of the floating-point data is defined by passing in the same precision through a module, i.e.,

my_precision = selected_real_kind(p=15)

and real data is defined (for example) via,

Real(my_precision) :: a, b, c

Therefore (I think) there should not be any platform dependent precision problems. Also, anytime that the code reports on its progress and gives simple info, the results are correct. And, when I print out a few matrices that get defined as the code starts up, they are correct.
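One way to confirm what the kind module actually delivers under each compiler is to print the properties of the selected kind. The following is a minimal sketch, not the poster's code: the module name precision_mod and the program name check_kind are hypothetical, and only my_precision comes from the post.

```fortran
! Hypothetical module name; only my_precision appears in the original post.
Module precision_mod
  Implicit None
  Integer, Parameter :: my_precision = selected_real_kind(p=15)
End Module precision_mod

Program check_kind
  Use precision_mod
  Implicit None
  Real(my_precision) :: a

  ! Literals carry the kind suffix so they are not rounded to default real first
  a = 1.0_my_precision / 3.0_my_precision

  Print *, 'kind      =', kind(a)        ! compiler-specific kind number
  Print *, 'precision =', precision(a)   ! decimal digits, at least 15 here
  Print *, 'epsilon   =', epsilon(a)
  Print *, 'a         =', a
End Program check_kind
```

If the reported precision and epsilon agree across the compilers being compared, the declarations themselves are probably not the source of the discrepancy, although literal constants without a kind suffix could still be.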

The big problem is that the code is getting the important calculations wrong. They are not blowing up or giving NaN; the data is reasonable but wrong. Pictures of the results from ifort look nothing like those from the other compilers.

Anyone have any ideas? I know next to nothing about the details of the million or so ifort flags, but I tried my best to get rid of any optimization for the purpose of testing. Does ifort do anything funny even with -O0?

If you have un-initialized variables or subscript range violations, -O0 won't necessarily fix it. You will have to look at ifort flags. My favorite tactic
ifort -help | less
and searching for likely text strings, abbreviates this task.

My standard advice is to debug or add code to display intermediate values as the computations progress and figure out where they diverge from the results you are expecting. Offhand I cannot think of switches that would help you here.

If you need help, please contact Intel Premier Support.

Steve - Intel Developer Support

Geoffrey,

The problem may be a precision problem due to the following issues:

1) On the compilation you may need to check/set the option for default real kind size.

2) One instance may be using FP87 the other using SSE

3) The intrinsic and library functions that have two forms xxx and Dxxx (e.g. SQRT and DSQRT) may be treated differently by the different vendors' compilers with regard to calling argument type and result destination type. In the case of SQRT it may be returning a REAL(4) when called with a REAL(8) argument.

4) Constants/data such as 0.1 when compiled with default real(8) may be stored as real(4).

5) Constants with E when compiled with default real(8) may be stored as real(4).

There may be other issues too.

Jim Dempsey

www.quickthreadprogramming.com

Geoffrey,

I also forgot to mention one of those "Doh!" situations where the runtime input file(s) in one of the instances is different from the runtime input file(s) in the other. e.g. is current directory a) project directory, b) execution file directory, c) prior current directory, d)...

Jim Dempsey

Jim's suggestions triggered a thought in my mind. The Intel Mac compiler always uses the SSE registers for floating point, since all Intel-based Macs have SSE3 capability. This means that operations are carried out in declared precision. I would expect this to be similar to PowerPC but perhaps not to other x86 systems using default compiler options.

Steve - Intel Developer Support

Geoff,

On the Intel-based Macs, declared precision is used. On a generic IA-32 based system (typically PC but doesn't have to be), the default is usually to use the X87 instructions for floating point (compatible with Pentium II). These use an extended precision format and single precision operations tend to be done in double, as switching precisions is slow. This can cause odd results as the exact answer you get depends on when the intermediate result gets rounded to declared precision. This is why PC users sometimes get results that are different (often "better") from those on other platforms such as SPARC.

On the Intel-based Mac, the SSE instructions are (I think) used for most everything and these always operate in declared precision.

I don't remember offhand what PPC does. It might use double for everything.

Steve - Intel Developer Support

Have you experimented with the floating point options?

/Qpc32           set internal FPU precision to 24 bit significand
/Qpc64           set internal FPU precision to 53 bit significand (DEFAULT)
/Qpc80           set internal FPU precision to 64 bit significand
/QIfist[-]       enable/disable(DEFAULT) fast float-to-int conversions
/Qrcd            same as /QIfist
/Qrct            set internal FPU rounding control to truncate
/rounding-mode:chopped
                 set internal FPU rounding control to truncate
/Qprec           improve floating-point precision (speed impact less than /Op)
/Qfp-port[-]     round fp results at assignments & casts (some speed impact)
/Qprec-div[-]    improve precision of FP divides (some speed impact)
/Qprec-sqrt[-]   determine if certain square root optimizations are enabled
/Qcomplex-limited-range[-]
                 enable/disable(DEFAULT) the use of the basic
                 algebraic expansions of some complex arithmetic
                 operations. This can allow for some performance
                 improvement in programs which use a lot of complex
                 arithmetic at the loss of some exponent range.
/Qftz[-]         enable/disable flush denormal results to zero
/Qssp            enable software-based speculative pre-computation
/fp:             enable floating point model variation
                   except[-]  - enable/disable floating point semantics
                   fast[=1|2] - enables more aggressive floating point optimizations
                   precise    - allows value-safe optimizations
                   source     - enables intermediates in source precision
                   strict     - enables /fp:precise /fp:except, disables contractions,
                                enables property to allow for modification of the
                                floating point environment
/[no]fpconstant  extends the precision of single precision constants assigned
                 to double precision variables to double precision

I had experienced a consistency problem on Windows where constants containing repeating binary fractions were stored in single precision when used in double-precision expressions. This caused errors in the 6th or 7th place when the computation required accuracy to the 14th or 15th place. The problem would not have been noticed except for the accumulation of errors.
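The accumulation effect described above can be reproduced with a small test. This is an illustrative sketch only: the program name, the kind parameter dp, and the loop count are assumptions, and it presumes the compiler's default real is single precision (no option that promotes default real to 8 bytes).

```fortran
Program constant_drift
  Implicit None
  Integer, Parameter :: dp = selected_real_kind(p=15)
  Integer, Parameter :: n = 1000000
  Real(dp) :: sum_default, sum_dp
  Integer :: i

  sum_default = 0.0_dp
  sum_dp      = 0.0_dp

  Do i = 1, n
    ! Literal without a kind suffix: rounded to default real, then widened
    sum_default = sum_default + 0.1
    ! Literal with the working kind: rounded once, directly to dp
    sum_dp      = sum_dp      + 0.1_dp
  End Do

  Print *, 'default-kind literal:', sum_default
  Print *, 'dp-kind literal     :', sum_dp
  Print *, 'difference          :', sum_default - sum_dp
End Program constant_drift
```

Under that assumption, each addition of the unsuffixed literal carries an error of roughly 1.5e-9, so after a million additions the two sums would be expected to differ on the order of 1e-3, far more than double-precision roundoff.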

On a completely different track:

Are you compiling with IMPLICIT NONE?
(treatment of uninitialized variables may be different)

Are you using COMMON's or Modules?
(Various implementations of Fortran treat ambiguous use of COMMON differently)

Are you assuming SAVE or AUTOMATIC?
(may affect results on 2nd and later use of uninitialized variables)

Have you run with /uninit (uninitialized variable check)?

Have you GenInterfaces/CheckInterfaces?
(verify argument passing is correct and consistent)

If you are calling 3rd party libraries there may be issues relating to user defined type structures where the programmer assumes sequencing, packing, and/or padding.
(may require you to explicitly state requirements in user defined types).

If you are calling C/C++ libraries, strings and string buffers may require the trailing null pad. This might have been treated differently between systems.
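On the SAVE versus AUTOMATIC question in the checklist above, code that relies on a local variable keeping its value between calls behaves the same on every compiler only if the variable is explicitly saved and initialized. A minimal sketch with hypothetical names (counter_mod, next_count), not taken from the poster's code:

```fortran
Module counter_mod
  Implicit None
Contains
  Function next_count() Result(n)
    Integer :: n
    ! Explicit Save with an initial value: the count persists between calls
    ! regardless of whether the compiler defaults locals to static or automatic.
    Integer, Save :: calls = 0

    calls = calls + 1
    n = calls
  End Function next_count
End Module counter_mod

Program save_demo
  Use counter_mod
  Implicit None
  Integer :: first, second

  first  = next_count()
  second = next_count()
  Print *, first, second
End Program save_demo
```

Without the Save attribute and the initialization, the value of calls on entry would be undefined, and different compilers could legitimately produce different results.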

Jim Dempsey

Many of those options apply only to x87 code. As Steve said, that option doesn't apply to Mac. Of course, the Windows spelling of the options doesn't apply to Mac or Linux platforms.
Not all of the -fp-model options available in C are available in ifort. It's certainly good to try -fp-model precise in case there is a question about accuracy of aggressive optimizations.
The Fortran standard requires that constants be rounded to the expressed precision. ifort -fpconstant allows departure from that rule. There may be a similar option in your PPC compiler. As Steve said, the PPC may use extra (double) precision, with results similar to Windows x87 code.

>> The Fortran standard requires that constants be rounded to the expressed precision.

Tim

real(8) :: value
...
if(value .lt. 0.1) then

What is the expressed precision of 0.1? (with default real as Default for compiler)
What is the expressed precision of 0.1? (with default real as real(4))
What is the expressed precision of 0.1? (with default real as real(8))
What is the expressed precision of 0.1? (with /fpconstant)

I do not know about V10.0, as I haven't installed it yet.

But for various earlier versions of IVF the literal 0.1 is stored as a real(4) even though default real size is set to 8.

Due to .1 being a repeating binary fraction, the fraction has to be rounded/truncated. The real(4) value and real(8) values are different, with the real(8) value being a closer approximation.

The problem lies in when there is no expressed precision. i.e. there is a presumed precision.

Then there is

real(8) :: value_8
real(4) :: value_4
...
if(value_8 .lt. 0.1) call foo
if(value_4 .lt. 0.1) call foo

Is there one literal or two literals for 0.1?
If one then what size?

Jim Dempsey

0.1 is "default real". Always. If you set the default real size to be 8 bytes then that's what you get - double precision. Jim, if you have a test case that shows otherwise, I'd like to see it. My own experiments show that it works as advertised.

Steve - Intel Developer Support

That is the recommended method for specifying kinds.

Steve - Intel Developer Support

Very curious. I don't see anything obviously wrong with the code. If you can come up with a self-contained sample and options which make it fail, please let us know at Intel Premier Support. I tried some combinations and didn't see "01" appear.

Steve - Intel Developer Support

Geoffrey,

Check that these options are off or missing

/1, /Qonetrip execute any DO loop at least once

If set, your ZeroPad loop will execute once when you expect it not to execute at all.

Or, explicitly insert an IF test for Length_m (both for zero length and length of 1).

I would prefer to redo the function to something along the lines of:

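Character(80) Function DigitToString(m,ndigits)
  Implicit None
  Integer, Intent(In) :: m, ndigits
  Character(80) :: Buffer
  If (m .GE. 10**ndigits) Then
    !! m does not fit in ndigits digits: put favorite error handler here
    DigitToString = ' '   ! keep the result defined on this path
  Else
    ! Write the integer, offset by 10**ndigits, to a character variable used as a buffer.
    Write (Buffer,*) m + 10**ndigits
    ! List-directed output may begin with blanks, so left-justify first
    Buffer = AdjustL(Buffer)
    ! Remove extra '1'
    DigitToString = Buffer(2:)
  End If
End Function DigitToString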

Jim Dempsey

I would prefer to use an I0 format rather than list-directed in the WRITE. There's too much implementation-dependent behavior in list-directed.
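A sketch of what the I0 variant might look like, assuming the same DigitToString(m, ndigits) interface as the function above. The module name digit_mod, the local variable fmt, and the demo program are hypothetical additions for illustration.

```fortran
Module digit_mod
  Implicit None
Contains
  Function DigitToString(m, ndigits) Result(str)
    Integer, Intent(In) :: m, ndigits
    Character(80) :: str
    Character(32) :: fmt

    ! Build a format such as (I0.3): minimal field width,
    ! zero-padded to at least ndigits digits.
    Write (fmt, '("(I0.", I0, ")")') ndigits
    Write (str, fmt) m
  End Function DigitToString
End Module digit_mod

Program digit_demo
  Use digit_mod
  Implicit None
  Character(80) :: s1, s2

  ! Assign first so no function doing internal I/O is called inside Print
  s1 = DigitToString(7, 3)
  s2 = DigitToString(42, 2)
  Print *, Trim(s1)
  Print *, Trim(s2)
End Program digit_demo
```

With this format the result is left-justified with no leading blanks, so no AdjustL or substring step is needed; the two calls should yield 007 and 42. Unlike the version above, an m with more than ndigits digits is simply written in full rather than treated as an error.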

Steve - Intel Developer Support
