Exception handling and OpenMP leads to segfault.

In order to capture exception messages (from the what() method) and make sure that exceptions are not thrown past parallel OpenMP sections, the following pattern is used.

When testing the attached code using OpenMP parallelisation with the pattern, it seems to work fine all the time.
But when this pattern is run in a bigger application, it segfaults pretty much all the time:

bool abort = false; // shared variable that indicates an abort
string error_message; // shared variable that should contain the exception messages

#ifdef _OPENMP
#pragma omp parallel for collapse(2) schedule(dynamic)
#endif
for (int xl = m_area.second.m_begin; xl < m_area.second.m_end; xl += m_area.second.m_stride) {
  for (int sl = m_area.first.m_begin; sl < m_area.first.m_end; sl += m_area.first.m_stride) {
#ifdef _OPENMP
#pragma omp flush (abort)
#endif
    if (!abort) {
      try {
        throw std::runtime_error("Throwing some exception!"); // <-- this is usually a bigger mess which can throw
      } // end of try
      catch (exception const & e) {
#ifdef _OPENMP
#pragma omp critical (VEL_ERROR_MESSAGE_WRITE)
#endif
    error_message += string(e.what()) + string("\n"); // <-- this is where the segmentation violation happens
    abort = true;
#ifdef _OPENMP
#pragma omp flush (abort)
#endif
      }
      catch (...) {
#ifdef _OPENMP
#pragma omp critical (VEL_ERROR_MESSAGE_WRITE)
#endif
    error_message = "Unknown exception!";
    abort = true;
#ifdef _OPENMP
#pragma omp flush (abort)
#endif
      }
    } // if (!abort)
  } // end of sl loop
 } // end of xl loop

#ifdef _OPENMP
#pragma omp barrier
#endif
if (abort) {
  throw runtime_error("Velocity precomputation encountered"
              " the following error: " + error_message);
}

If the bigger application is run in a debugger, the debugger reports a segmentation violation when assigning the exception string to the shared variable in the critical section. The stack trace is as follows: __kmp_release_lock <- __kmpc_end_critical <- ...537__par_loop0_2_1460 <-...

Is the pattern itself flawed?

Any advice or help on how to resolve the problem would be very much appreciated!

Best regards
Andreas

Attachment: exception2.cpp (text/x-c++src, 1.43 KB)

I have two questions:

- Do you need #include "boost/lexical_cast.hpp" in the test case?
- Could you post command line options for Debug configuration?

Thanks.

Sergey,

The lexical_cast is not an essential part of the testing or the original program. This can be removed. I just thought it nicer to have a report on which thread is throwing.

The command line options for the debug build are:

icpc -static -g -traceback -w -vec_report3 -DMKL -DIPP -DOMP -D__PURE_INTEL_C99_HEADERS__ -openmp -O0 -fp-model precise -o exception2 exception2.cc

This is to stay in line with the original application's command line options. But as mentioned in the original post, I never saw the exception2.cc test code show the same issue as the big application.

Best regards
Andreas

>>...
>>#pragma omp critical ( VEL_ERROR_MESSAGE_WRITE )
>>#endif
>> error_message = "Unknown exception!";
>> abort = true;
>>...

Please try to increase the OpenMP stack size per thread in the real application, since this looks like stack corruption.

Sergey,

I have tried setting OMP_STACKSIZE and KMP_STACKSIZE, but I still get the segmentation violation. In addition, sometimes I get similar messages to:

*** glibc detected *** bin/Linux/2.6/x86_64_SSE4_c6/application: double free or corruption (!prev): 0x00007ffb28083ee0 ***

Sometimes it is more than just this line.

Should I also try to increase the normal program stack size?

Thank you very much for your help!

Best regards
Andreas

>>...Should I also try to increase the normal program stack size?

Yes. ( Stack Reserve and Stack Commit values ).

Try:

#ifdef _OPENMP
#pragma omp critical (VEL_ERROR_MESSAGE_WRITE)
{ // *** add this
#endif
    error_message += string(e.what()) + string("\n"); // <-- this is where the segmentation violation happens
    abort = true;
#ifdef _OPENMP
} // *** add this
#pragma omp flush (abort)
#endif

Jim Dempsey

www.quickthreadprogramming.com

Do the same for the catch(...) critical section.
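A minimal sketch of what the braces change, assuming a simplified loop in which every iteration throws. Without braces, `#pragma omp critical` applies only to the single statement that follows it, so in the original code `abort = true` runs outside the lock. The names `CollectResult` and `collect_errors` are hypothetical and not taken from the attached test case. As reported later in the thread, this change alone did not remove the crash.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical result type: the abort flag plus all collected messages.
struct CollectResult {
    bool aborted;
    std::string messages;
};

// Every iteration throws; the catch block records the message under a lock.
// The pragmas are ignored when OpenMP is disabled, so this also builds serially.
CollectResult collect_errors(int iterations) {
    CollectResult result{false, std::string()};

    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < iterations; ++i) {
        try {
            throw std::runtime_error("Throwing some exception!");
        } catch (std::exception const& e) {
            // The braces turn both updates into one structured block,
            // so the string append and the flag update are both protected.
            #pragma omp critical (VEL_ERROR_MESSAGE_WRITE)
            {
                result.messages += std::string(e.what()) + "\n";
                result.aborted = true;
            }
        }
    }
    return result;
}
```

Calling `collect_errors(8)` should return one message line per iteration, whatever the number of threads.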

Jim Dempsey


Jim,

Unfortunately adding the brackets does not help.

Sergey,

Could you explain the difference between stack reserve and stack commit size? I am running those tests on a Linux system and I only know about a generic stack size.

Thank you very much for all your help!

Best regards
Andreas

As you're not interested in Windows, you can ignore the reserve/commit stuff and read up on how the shell of your choice sets stack limit.
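On Linux the limit the shell sets can also be checked from inside the process. The following is a small sketch using the POSIX `getrlimit` call; the function name `describe_stack_limit` is made up for illustration. Note that this reports the process (main thread) limit only; the OpenMP worker thread stacks are sized separately, for example through OMP_STACKSIZE as tried above.

```cpp
#include <sys/resource.h>  // POSIX: getrlimit, RLIMIT_STACK, RLIM_INFINITY
#include <string>

// Returns the current soft stack limit of the process as readable text.
std::string describe_stack_limit() {
    struct rlimit limit{};
    if (getrlimit(RLIMIT_STACK, &limit) != 0) {
        return "unknown";  // the query itself failed
    }
    if (limit.rlim_cur == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(limit.rlim_cur / 1024) + " KiB";
}
```

Printing the result at program start is a quick way to confirm that the limit set in the shell actually reached the application.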

Here are all stack related options for GCC C++ compiler I found:
...
-fstack-check Insert stack checking code into the program
-fstack-limit This switch lacks documentation
-fstack-limit-register= Trap if the stack goes past
-fstack-limit-symbol= Trap if the stack goes past symbol
-mstack-arg-probe Enable stack probing
...

Andreas,

OpenMP and STL are mixed at the moment in your test case. So, could you try to check the test case without using the variable error_message ( string type )? That is, without STL based outputs to the console and use printf CRT-function instead.

Here is a modified piece of code posted by Jim:

#ifdef _OPENMP
#pragma omp critical ( VEL_ERROR_MESSAGE_WRITE )
{ // *** add this
#endif
printf( "%s\n", e.what() ); // <-- this is where the segmentation violation happens
abort = true;
#ifdef _OPENMP
} // *** add this
#pragma omp flush (abort)
#endif

Sergey,

I have now tried increasing the stack size (both OpenMP and the process limit) by setting OMP_STACKSIZE 16m and limit -s unlimited.
This does not help. The application still segfaults at this point.

I am also not sure if this is related to the stack size, as I can see no recursive calls here and I do not think that I have big objects on the stack. The machine I am running this on is a 16-core machine, but an 8-core machine shows the same behaviour.

I have also tried the printf instead of the std::string operation. This does not help either. The exception message is printed to the console several times, but then the segfault happens. ( SIGSEGV Segmentation Violation signal)

And I also tried compiling with the above stack related compiler options. No luck with that either.

Best regards
Andreas

If I run the application in gdb, I get the following output (not sure if this is useful though):

[New Thread 0x2aab60d3a700 (LWP 28544)]
[New Thread 0x2aab61778700 (LWP 28545)]
[New Thread 0x2aab6c400700 (LWP 28546)]
[New Thread 0x2aab6c801700 (LWP 28547)]
[New Thread 0x2aab6cc02700 (LWP 28548)]
[New Thread 0x2aab7c400700 (LWP 28549)]
[New Thread 0x2aab84801700 (LWP 28550)]
[New Thread 0x2aab84c02700 (LWP 28551)]
[New Thread 0x2aab90400700 (LWP 28552)]
[New Thread 0x2aab90801700 (LWP 28553)]
[New Thread 0x2aab9c400700 (LWP 28554)]
[New Thread 0x2aab9c801700 (LWP 28555)]
[New Thread 0x2aaba8400700 (LWP 28556)]
[New Thread 0x2aaba8801700 (LWP 28557)]
[New Thread 0x2aabb4400700 (LWP 28558)]
[New Thread 0x2aabb4801700 (LWP 28559)]
Throwing some exception!
Throwing some exception!
Throwing some exception!
Throwing some exception!
Throwing some exception!
Throwing some exception!

Program received signal SIGSEGV, Segmentation fault.
0x000000000125186a in __kmp_acquire_lock ()

and

(gdb) backtrace
#0  0x000000000125186a in __kmp_acquire_lock ()
#1  0x000000000123bca4 in __kmpc_critical ()

...

Thanks for the feedback.

>>The command line options for the debug build are:
>>
>>icpc -static -g -traceback -w -vec_report3 -DMKL -DIPP -DOMP -D__PURE_INTEL_C99_HEADERS__ -openmp
>>-O0 -fp-model precise -o exception2 exception2.cc

Here are a couple of notes about command line options for the Debug configuration:

1. Option -w suppresses warnings. Remove it so that warnings are enabled, and review all of them.

2. __PURE_INTEL_C99_HEADERS__

Why do you need it? Try to remove it just to verify that it is not related to the problem.

3. -fp-model precise

Could you try a different Floating Point model?

4. -static

Could you try dynamic linking?

If nothing helps then a reproducer of the problem will be needed. Could you create it? Thanks in advance.

>>Program received signal SIGSEGV, Segmentation fault.
>>0x000000000125186a in __kmp_acquire_lock ()

The two likely causes of this are:

a) Code in __kmp_acquire_lock () pushed something over the edge of available stack.
b) The pointer to the lock variable is in never never land (address not mapped to your process's virtual memory)

As to what instigates these conditions... you've got the code.

Jim Dempsey


Does icpc have an analog to ifort's -heap-arrays option?  That option in the fortran compiler lets you set a threshold at which the compiler will allocate certain arrays on the heap vs the stack and has solved problems with openmp and segfaults for me in the past.

>>...Does icpc have an analog to ifort's -heap-arrays option?

No and it is a known issue ( discussed a couple of times ) that some number after -heap-arrays option ( like -heap-arrays 1024 ) is ignored.

Quote:

Sergey Kostrov wrote:

>>...Does icpc have an analog to ifort's -heap-arrays option?

No and it is a known issue ( discussed a couple of times ) that some number after -heap-arrays option ( like -heap-arrays 1024 ) is ignored.

That ifort allocation size option allows only fixed size allocations (size known at compile time) to go on heap.  More useful might be an option which puts small fixed size allocation on stack and larger ones on heap.

Without the numeric option, the current ifort heap-arrays option puts all allocations on heap.  I suppose Sergey meant that when you give a numeric option, it leaves all variable size allocations on stack.

The programmer can change:

double foo[someBigSize];

to

vector<double> foo(someBigSize);

for those stack allocations that are exceedingly large.
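A short sketch of that replacement, assuming the array size is only known at run time. The function name `sum_of_indices` and its contents are illustrative only; the point is that the vector's storage comes from the heap, so a large size does not eat into the thread's stack.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Fills a large buffer with 0, 1, 2, ... and returns the sum of its elements.
double sum_of_indices(std::size_t someBigSize) {
    // double foo[someBigSize];            // would live on the stack (and is not standard C++)
    std::vector<double> foo(someBigSize);  // element storage is allocated on the heap
    std::iota(foo.begin(), foo.end(), 0.0);
    return std::accumulate(foo.begin(), foo.end(), 0.0);
}
```

The vector releases its memory automatically when it goes out of scope, so no explicit delete is needed.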

Jim Dempsey


I did not have too much time to look more closely into debugging this issue. One thing I tried (successfully) is to use an array of strings (std::vector< std::string >) to collect the exception messages. The vector itself is a global array and has one element for each thread. Using the omp_get_thread_num function, the catch part then assigns the exception message from what() to the thread's element in the string vector. That seems to work.
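The workaround could look roughly like the sketch below. It assumes a simplified loop in which every odd iteration throws, and it uses a local vector instead of the global one described above; `run_with_per_thread_messages` and `join_messages` are hypothetical names. Each thread writes only to its own slot, so no critical section is involved in the catch block.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch also builds without OpenMP.
inline int omp_get_max_threads() { return 1; }
inline int omp_get_thread_num() { return 0; }
#endif

// Returns one message slot per thread; a slot stays empty if that thread caught nothing.
std::vector<std::string> run_with_per_thread_messages(int iterations) {
    // Size the vector before the parallel region; it is never resized inside it.
    std::vector<std::string> messages(static_cast<std::size_t>(omp_get_max_threads()));

    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < iterations; ++i) {
        try {
            if (i % 2 == 1) {
                throw std::runtime_error("iteration failed");
            }
        } catch (std::exception const& e) {
            // Each thread touches only its own element.
            messages[static_cast<std::size_t>(omp_get_thread_num())] +=
                std::string(e.what()) + "\n";
        }
    }
    return messages;
}

// Concatenates the per-thread slots after the parallel loop has finished.
std::string join_messages(const std::vector<std::string>& messages) {
    std::string all;
    for (const std::string& m : messages) {
        all += m;
    }
    return all;
}
```

After the loop, a non-empty result from `join_messages` plays the role of the shared abort flag and can be used to build the exception that is rethrown outside the parallel region.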

Regarding the elements on the stack, I am already using dynamic (heap) allocation for pretty much all data, at least in the parts of the code that I have immediate control over. I will check if there could be a problem with that.

Jim: Is there any way to see if the stack gets corrupted in the loop? Any tool that might help?

Thank you very much for all your help!
Best regards
Andreas

Intel C++ has a runtime check for stack corruption. While it will not catch all errors, it should catch those errors that disturb memory in an inter-frame gap (trashing signature bytes stored there for this purpose).

One particular problem you should look at with respect to your code is whether you are using functors (a function you declare inline in your code and assign to a variable, then use as an argument to a later statement). Then be careful about what you capture by reference [&] or by value [=]. If you intend (require) each thread to have its own value, do not capture by reference.
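To illustrate the difference between the two capture modes, here is a small single-threaded sketch; `capture_demo` is a made-up name. In a parallel region the same rule applies: a lambda that captures by reference works on the one shared object, and it dangles if that object goes out of scope first.

```cpp
#include <vector>

// Shows that [=] copies the variable when the lambda is created,
// while [&] keeps referring to the live variable.
std::vector<int> capture_demo() {
    int offset = 1;
    auto by_value = [=](int x) { return x + offset; };      // holds its own copy of offset (1)
    auto by_reference = [&](int x) { return x + offset; };  // reads offset at call time
    offset = 100;  // only the by-reference lambda sees this change
    return {by_value(1), by_reference(1)};
}
```

The first element of the result is computed from the copied value 1, the second from the updated value 100.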

Jim Dempsey


Here is a set of Intel C++ compiler options related to stack verifications, etc:

/Qfp-stack-check - Enable fp stack checking after every function/procedure call

/RTCs - Enable stack frame runtime checks

/Qcheck-pointers-dangling:<keyword> - Specifies what type of dangling pointer checking occurs. Possible values are:
none - Disables dangling pointer checking. This is the default.
heap - Check dangling references on heap.
stack - Check dangling references on stack.
all - Check dangling references on both heap and stack.
This switch is only valid with Intel(R) Parallel Studio XE

/check:<keyword>[,<keyword>,...] - Check run-time conditions.
keywords: [no]conversions, [no]stack, [no]uninit

/Qsfalign8 - May align stack for functions with 8 or 16 byte vars (DEFAULT)
/Qsfalign16 - May align stack for functions with 16 byte vars
/Qsfalign - Force stack alignment for all functions
/Qsfalign- - Disable stack alignment for all functions

/F<n> - set the stack reserve amount specified to the linker
