intel@karancevic.com
August 22, 2008 8:07 AM PDT
AMD vs Intel numerics re-visited

Hello...

I cannot get my managers to let go of this one. The last time I tested, my code compiled with IVF gave slightly different answers on AMD and Intel CPUs. We did not have this issue with CVF. Has Intel done anything recently to remedy this problem? Here is a quick guide to my compiler settings. Thanks!

Nick

 

32-Bit

· Fortran -> Optimizations -> Disable Optimizations (Release Configuration Only, mainstream versions of our code only).  This option is due to 387 FPU, to be re-visited when compiling native 64-bit code.

· Fortran -> Code Generation: Enable Recursive

· Fortran -> Data:  Initialize to Zero “Yes”

· Fortran -> Floating point: FPE0 (crash on NaN)

· Fortran -> Runtime: Generate Traceback, Check array and string bounds (this enables the DOS version to return the line and subroutine where the crash occurred).

· Fortran -> Libraries: Ensure that “Multithreaded” is selected, not “Multithreaded DLL”; or, that “Multithreaded Debug” is selected, not “Multithreaded Debug DLL”.

· Linker -> Enable Incremental Linking: No.  (See “Traceback” above)

 

tim18
August 22, 2008 9:45 AM PDT
#1

If you are interested in normal SSE2 results, the debug x87 results will not be relevant, unless they expose an outright error in your SSE2 results. 

To avoid the single-precision SSE instructions whose numerical results differ among the various families of AMD CPUs, you should set /Qprec-div /Qprec-sqrt.  These options are included in /fp:precise or /fp:source.  As the Intel CPU families introduced over the last year have excellent performance for IEEE-accurate divide and sqrt, there is more reason now to use these options.
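
As a rough illustration of why an exact divide and a reciprocal-based divide can disagree, here is a minimal sketch in Python rather than Fortran. It does not reproduce the SSE approximate-reciprocal instructions themselves; it only assumes that the faster path behaves like "multiply by a reciprocal", and the function names are made up for this example.

```python
# Sketch only: compares an IEEE-rounded division with a division
# rewritten as multiplication by a reciprocal (hypothetical helper names).

def true_divide(x, y):
    """Single correctly rounded division, as /Qprec-div is meant to preserve."""
    return x / y

def reciprocal_divide(x, y):
    """Division replaced by reciprocal-then-multiply: two roundings instead of one."""
    return x * (1.0 / y)

x, y = 3.0, 10.0
exact = true_divide(x, y)
approx = reciprocal_divide(x, y)

# The two results differ in the last bit for this input pair.
print(exact, approx, exact == approx)
```

A one-bit difference like this is harmless in isolation, but it can grow through a long calculation, which is how two CPUs with different reciprocal approximations end up with visibly different answers.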

Those /fp options also prevent auto-vectorization optimizations where results vary slightly with data alignment, and those where math library functions differ slightly between Intel and AMD.
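
To see why a vectorized reduction can give a different answer from a scalar one, the sketch below sums the same values in source order and in interleaved "lanes", the way a vectorized loop might. This is an illustration in Python under assumed names (`sequential_sum`, `lane_sum`), not what the compiler actually emits.

```python
# Sketch only: floating-point addition is not associative, so the order
# in which partial sums are formed changes the rounded result.

def sequential_sum(values):
    """Add the values strictly in source order."""
    total = 0.0
    for v in values:
        total += v
    return total

def lane_sum(values, lanes):
    """Accumulate into `lanes` interleaved partial sums, then combine them,
    roughly as a vectorized reduction would."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total

data = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(data))   # source order
print(lane_sum(data, 2))      # two interleaved partial sums
```

With this input the two orders disagree because adding 1.0 to 1e16 is lost to rounding in one order and not in the other. Which elements share a lane depends on where the vector loop starts, which is why alignment can affect the result.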

You must also take care to use the same /Qftz setting; those /fp options set /Qftz-, which you can undo by following them with /Qftz.  You may want to test your application both with /Qftz and with /Qftz- (for compilation of the main program).
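
The effect of the flush-to-zero setting can be imitated in a few lines. The sketch below is a simulation in Python, not the hardware mode itself: `flush_to_zero` is a hypothetical helper that replaces any subnormal result with zero, which is the behaviour assumed here for /Qftz.

```python
import sys

# Smallest positive normal double; anything nonzero below this is subnormal.
SMALLEST_NORMAL = sys.float_info.min

def flush_to_zero(value):
    """Simulated flush-to-zero: subnormal results are replaced by 0.0."""
    if value != 0.0 and abs(value) < SMALLEST_NORMAL:
        return 0.0
    return value

a = 1.5 * SMALLEST_NORMAL
b = SMALLEST_NORMAL

gradual = a - b                  # gradual underflow keeps a tiny nonzero result
flushed = flush_to_zero(a - b)   # simulated flush-to-zero discards it

print(gradual != 0.0, flushed == 0.0)
```

Code that later divides by such a difference, or tests it against zero, takes a different path in the two modes, so both builds need the same setting before their results are compared.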

I suppose you must set some of these options under additional settings.

If your source code doesn't initialize data correctly, the Initialize to Zero option can't be depended upon to avoid problems, including possible differences between Intel and AMD, as well as differences between debug and optimized mode.  You would have had the same problem with CVF if you had set threading-compatible options.

To maintain a correct comparison between CVF and ifort, you would also need to set the float consistency option in CVF and /assume:protect_parens in ifort.
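
The reason parentheses need protecting is that regrouping a floating-point expression can change its value. A minimal sketch in Python, with values chosen purely for illustration:

```python
# Sketch only: the same three operands, grouped two ways.
a, b, c = 1e16, -1e16, 1.0

as_written = (a + b) + c      # the large terms cancel first, then 1.0 is added
reassociated = a + (b + c)    # 1.0 is absorbed into -1e16 before the cancellation

print(as_written, reassociated)
```

If one compiler honours the grouping in the source and another regroups the expression, the two builds can differ even on the same CPU, which would muddy an Intel vs AMD comparison.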



Steve Lionel (Intel)
August 22, 2008 9:55 AM PDT
#2 Reply to #1
The reason you did not see this with CVF is that CVF had no support for SSE/SSE2 floating point. It always used the x87 instructions.




