Forum Jump

Select Group :
Select Forum :
Sorted By :
Sort Order :
From The :
 
Thread Tools  Search this thread 
taka2488
Total Points:
10
Registered User
July 7, 2009 7:20 AM PDT
floating point problem?.
Hi,

I use Intel fortran compiler.
My problem is the executed result of ivf (Intel Visual Fortran) project is different from that of cvf (Compaq Visual Fortran) project.
Both has the same code.

For example,

## sample code #######################################
PROGRAM float
P=0.9935059
P0=1.03323
C
P=AMIN1(500.0,P+P0)-P0
C
WRITE(6,*) P
END
######################################################

This result is

0.9935058 in ivf
0.9935059 in cvf
.

I think cvf result is currect.
Please teach me this problem.


cf, This ivf project was converted from cvf.
and
This is my lacktop specification.

OS : Windows Server 2003
CPU : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
tim18
Total Points:
66,397
Status Points:
66,397
Black Belt
July 7, 2009 10:00 AM PDT
Rate
 
#1
If you haven't read about the compile options for both ifort and CVF please do so.  With current 32-bit ifort, if you wish to evaluate simple expressions in double precision, as CVF did, /arch:ia32 should work.  It would be preferable to write what you mean in standard Fortran, e.g.
P=MIN(500.0d0,dble(P)+P0)-P0
(if you don't like Fortran 90), so that your result doesn't depend on the compile options.
You have used one bit of f77, so I won't try to show you an f66 version.
As mentioned further down, these compilers are evaluating your expressions at compile time, so run time settings will have no effect, and the current compiler is technically correct in doing what you asked rather than what you may have meant. 


Steve Lionel (Intel)
Total Points:
112,121
Status Points:
112,121
Black Belt
July 7, 2009 2:00 PM PDT
Rate
 
#2
I tried /arch:ia32 on this program, as well as /fp:source, but it didn't change the result.  I will comment that this computation is right on the edge of what is representable in single precision and it does not surprise me that a one-LSB difference might be seen between implementations.

I'll have to see what CVF is doing differently.





jimdempseyatthecove
Total Points:
34,847
Status Points:
34,847
Black Belt
July 7, 2009 2:45 PM PDT
Rate
 
#3 Reply to #2

Also, the diffrerence may be only in the print routine but not in the binary representation.

Try printing using an Z (hex) format

Jim

Jeff Arnold (Intel)
Total Points:
600
Status Points:
100
Brown Belt
July 7, 2009 2:52 PM PDT
Rate
 
#4 Reply to #2
I'll have to see what CVF is doing differently.


Without looking at any code generated by either IVF or CVF, I think CVF is carrying out the computation in extended precision in the x87 register set and IVF is doing the computation in single-precision (as the code specifies) in the SSE registers.

If one does all the arithmetic in IEEE single-precision, the result of (P+P0)-P0 is smaller than P by 1 lsb.  The reason is alignment/normalization shifts: P0 is shifted right one place in the addition and there is another right shift to normalize the sum.  Then there is a left shift of two places to normalize the result of the subtraction.

Obligatory reference to Goldberg's paper on floating-point arithmetic:  http://dlc.sun.com/pdf/800-7895/800-7895.pdf


Steve Lionel (Intel)
Total Points:
112,121
Status Points:
112,121
Black Belt
July 8, 2009 8:11 AM PDT
Rate
 
#5
When compiled with optimizations, both compilers do all the arithmetic at compile-time.  Here's what the instructions look like without optimization - I've done some annotation.

CVF:

fld dword ptr .bss8 [push P on FP stack]
fadd dword ptr .bss4 [add P0 to P]
fcom dword ptr .literal8 [compare sum against 500.0]
mov eax, 0 [this and next four instructions are the rest of the  AMIN]
fnstsw ax
and ah, 65
jne L$1
fstp st(0)
fld dword ptr .literal8 [store 500]
L$1:
fsub dword ptr .bss4 [subtract P0 from sum]
fstp dword ptr .bss8 [store P]

IVF:

fld DWORD PTR [_2il0floatpacket.2]  [push 500.0 on FP stack]
fld DWORD PTR [-12+ebp]  [push P on FP stack]
fld DWORD PTR [-8+ebp] [push P0 on FP stack]
faddp st(1), st  [add P and P0]
fcom [compare sum to 500]
fnstsw ax  [this and next 4 are the AMIN]
sahf ;5.1
ja L1 ; Prob 50% ;5.1
fst st(1) [copy sum to FP stack location 1]
L1: ; fstp st(0) ;5.1
fld DWORD PTR [-8+ebp] [push P0 on FP stack]
fsubp st(1), st [subtract P0 from sum]
fstp DWORD PTR [-12+ebp] [store P0]

I am not an expert on the instruction set and I have to admit I don't understand everything here, but it looks to me as if the arithmetic instructions are equivalent.  The only thing I can't see is what precision mode each compiler sets the X87 registers to.  When I debug, the CVF display shows me one additional digit in the FP registers - I don't know if this is relevant.

Nevertheless, Jeff correctly notes that if you do the computation in declared precision, IVF gets the correct answer. 



Melvin
Total Points:
170
Status Points:
120
Green Belt
July 8, 2009 9:07 AM PDT
Rate
 
#6 Reply to #4
Using Intel Visual Fortran 10.0.025 I get the correct answer of 0.9935059 which is the correct answer.


Steve Lionel (Intel)
Total Points:
112,121
Status Points:
112,121
Black Belt
July 8, 2009 9:12 AM PDT
Rate
 
#7 Reply to #6
Quoting - Melvin
Using Intel Visual Fortran 10.0.025 I get the correct answer of 0.9935059 which is the correct answer.
"correct" is a matter of interpretation here.  I'll agree that, arithmetically, with infinite precision, that is correct.  Computationally, with single-precision floating point, it is incorrect.



Melvin
Total Points:
170
Status Points:
120
Green Belt
July 8, 2009 9:46 AM PDT
Rate
 
#8 Reply to #7
"correct" is a matter of interpretation here.  I'll agree that, arithmetically, with infinite precision, that is correct.  Computationally, with single-precision floating point, it is incorrect.
Sorry if I'm the only one on this thread who doesn't understand this, (that's what I get for jumping in and compiling code) but why is the answer incorrect when using single-precision arithmetic? 


Steve Lionel (Intel)
Total Points:
112,121
Status Points:
112,121
Black Belt
July 8, 2009 9:50 AM PDT
Rate
 
#9 Reply to #8
Quoting - Melvin
Sorry if I'm the only one on this thread who doesn't understand this, (that's what I get for jumping in and compiling code) but why is the answer incorrect when using single-precision arithmetic? 
See Jeff Arnold's reply above.  It is a difficult topic that most do not understand, leading to occasional surprising results.



jimdempseyatthecove
Total Points:
34,847
Status Points:
34,847
Black Belt
July 8, 2009 10:18 AM PDT
Rate
 
#10 Reply to #8

Assume the "correct" answer were

1.234567891011121314151617181920

But single precision real(4) has limited precision capability which is +/- not quite 7 digits
The above resultant value cannot be completely held, only the rounded value can be held
1.23456?  where ? is neither 7 nor 8 but some value between 7 and 8.

Open the Windows calculator, enter 1/3, do you see or expect an infinite number of 3's?
2/3 do you expect to see an infinate number of 6's or all 6's or all but last as 6's and one last 7?

In each of the above cases, are you willing to accept the result as close enough?

Here is another reference on floating point storage

http://steve.hollasch.net/cgindex/coding/ieeefloat.html
 
Jim Dempsey




Intel Software Network Forums Statistics

8285 users have contributed to 31229 threads and 99107 posts to date.
In the past 24 hours, we have 9 new thread(s) 39 new posts(s), and 55 new user(s).

In the past 3 days, the most popular thread for everyone has been comparison cilk++, openmp, pthreads first results The most posts were made to comparison cilk++, openmp, pthreads first results The post with the most views is Very amusing...  Escalated as

Please welcome our newest member tvinni