Unexpected NaN as return value

Hi,

I have a multi-threaded scientific computing application running on a Xeon workstation. It compiles cleanly with icc70 on Red Hat Linux 7.2, but it produces wrong results.

When I traced it with gdb, I found that one function call returns NaN instead of the double value I expected. However, if I step into this function, everything turns out fine and it returns a valid double.

After some research, I found an article about a bug in GNU libc that sporadically sets the wrong floating-point flags. So I made the program trap SIGFPE. That did not really help, because the program then stopped almost immediately at a float assignment statement.

I don't know if any of you have run into this before. Any comments? What else should I try? I am not sure whether the program works with gcc, since building it with gcc is not a simple effort.

thanks,
-Shan

We have had customers report issues with the floating-point (FP) stack being corrupted by their code. The typical cause is a function involving floating-point numbers whose declaration in one compilation unit differs from its definition in a DIFFERENT COMPILATION UNIT. This often happens in legacy code that has run for a very long time on many different platforms, so one is under the false impression that the code is correct because it executed correctly on so many of them. It worked there only because of differences in the floating-point implementation.

An example will help illustrate the problem:

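file foo.c:
badFunction( float a );    /* wrong: no return type, so the compiler assumes int */
callBad()
{
    float value = 10.;
    badFunction( value );  /* the float result is never popped off the FP stack */
}

file bar.c:
float badFunction( float a )    /* the actual definition returns float */
{
    return a;
}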

The problem is that badFunction returns a floating-point number on the FP stack, but callBad, which was told the function returns int, doesn't pop the value off the FP stack, so the stack is corrupted. The program doesn't crash at this point, but it will crash with an FP exception when the FP stack overflows (it holds 8 FP numbers). If FP exceptions are masked, the overflow shows up as a NaN instead of a crash, which may be what you are seeing. This makes it very difficult to find the function that is causing the corruption; typically it is not the function the program crashes in.

The next major release of the Intel compiler will provide an option to automatically generate code that checks for functions that corrupt the FP stack. I suggest you enter a support issue on https://premier.intel.com and we can provide a workaround until this new feature is released.

The example above is clearly a coding error, but on some architectures it happens to work correctly.
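
For comparison, here is a minimal sketch of what a corrected version of the example might look like, assuming the fix is simply to give the caller a prototype that matches the definition. In a real project the prototype would normally live in a shared header included by both files; the header name bad.h and the function name callGood are made up for illustration, and everything is shown in one file only so that it compiles on its own.

```c
/* Sketch only: both "files" are combined into one translation unit.
   bad.h and callGood are hypothetical names, not from the original code. */

/* bad.h (hypothetical shared header): the single prototype both files see */
float badFunction(float a);

/* bar.c: the definition, which matches the prototype above */
float badFunction(float a)
{
    return a;
}

/* foo.c: the caller now knows a float is returned and consumes it */
float callGood(void)
{
    float value = 10.0f;
    return badFunction(value);
}
```

With a matching prototype in scope, the compiler knows a float is returned and generates the code to take it off the FP stack, even if the caller ignores the result. Compiler warnings about implicit int and missing prototypes (for example gcc's -Wall and -Wmissing-prototypes) can help locate declarations like the bad one, though I have not checked which equivalent options your icc version offers.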
