TBB causes segfault in exception handling (OS X / gcc 7.2)

(I also posted this on StackOverflow because I am not sure if this is really a TBB issue: https://stackoverflow.com/questions/47555685/intel-tbb-causes-segfault-i...)

This correct example causes a segfault in the std::cout line with specific compilation parameters:

#include <stdexcept>  // std::runtime_error; also makes std::exception available
#include <iostream>

int main(int argc, char** argv) {
  try {
    throw std::runtime_error("BLA");
  } catch (const std::exception& exception) {
    std::cout << exception.what() << std::endl;
  }
}

It only occurs when compiling it on OS X with gcc-7.2 and Intel TBB:

g++-7 exception.cpp /usr/local/lib/libtbb.dylib

Since the OS X version of TBB is built with clang and linked against libc++, I tried building it with g++-7 from the repo. That does not change anything. Also, the order of arguments does not matter.

On Ubuntu, everything is fine.

This feels like a bug somewhere, but I am not sure where. Any ideas on how to work around this or where to file a bug?

Some more information that might be helpful:

Stacktrace

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000100000da1 a.out`main(argc=1, argv=0x00007fff5fbff880) at exception.cpp:8
   5      try {
   6        throw std::runtime_error("BLA");
   7      } catch (const std::exception& exception) {
-> 8        std::cout << exception.what() << std::endl;
   9      }
   10   }

Versions

[:~/tmp] $ g++-7 --version
g++-7 (Homebrew GCC 7.2.0) 7.2.0
[:~/tmp] $ brew info tbb
tbb: stable 2018_U1 (bottled)
[:~/tmp] $ sw_vers -productVersion
10.12.6

ASAN

ASAN:DEADLYSIGNAL
=================================================================
==9698==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000104c03b22 bp 0x7fff5affc940 sp 0x7fff5affc900 T0)
==9698==The signal is caused by a READ memory access.
==9698==Hint: address points to the zero page.
    #0 0x104c03b21 in main (/Users/markus/tmp/./a.out+0x100000b21)
    #1 0x7fffa9734234 in start (/usr/lib/system/libdyld.dylib+0x5234)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/Users/markus/tmp/./a.out+0x100000b21) in main
==9698==ABORTING
Abort trap: 6

Hello, Markus.

This configuration does not have official support in Intel TBB: https://software.intel.com/en-us/articles/intel-threading-building-block...

Investigating this problem will have low priority, but you are welcome to investigate it yourself and contribute a fix: https://www.threadingbuildingblocks.org/submit-contribution

Thank you,
Alexey M.

Could you please check if the issue goes away in case you throw and catch an exception that is not std::exception or its derivative?

Also, does it work if you use Clang instead of GCC, or a different version of GCC?

* Throwing an std::string works.

* g++-6 has the same issue

* With clang, I have no issues.
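For reference, the first check above (throwing something that is not derived from std::exception) might look like the following minimal sketch. The function name `throw_and_catch_string` is made up for illustration; the thread does not show the code that was actually used.

```cpp
#include <string>

// Hypothetical variant of the reproducer: throws a std::string instead of
// a std::exception derivative, so the handler is matched against the
// std::string typeinfo rather than the one for std::runtime_error.
std::string throw_and_catch_string() {
  std::string caught;
  try {
    throw std::string("BLA");
  } catch (const std::string& message) {
    caught = message;  // reached when the handler matches
  }
  return caught;
}
```

Calling this from main() and printing the result, built with the same g++-7 command line and libtbb.dylib as the original example, would repeat the check.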

Then this is likely related to the fact that TBB publicly exports certain symbols (typeinfo etc.) for standard exceptions that might be thrown by our binary. For backward compatibility reasons, we want to control the TBB ABI, and therefore we explicitly specify which symbols are exported and which are not. In the past we found out that we need to export the exception-related information so that exceptions thrown by TBB can be correctly caught in the application.

As far as I understand, exception support is part of the [platform-specific] C++ ABI conventions. For code built with different compilers to work well together, all compilers should properly implement these conventions. So in this case the problem could lie in any of the following:

  • the compiler that we use to build TBB. For OS X, it's either Clang or Intel Compiler; you can check that with `strings libtbb.dylib | grep BUILD_COMPILER`; or just tell me which version of TBB you use and I will find out.
  • the compiler that you use to build the application, i.e. GCC.
  • TBB - in case we do not export some symbols that Clang does not need but GCC does.

I would first use `nm -g a.out` for binaries built with GCC and Clang, and check the difference. Then you can try patching and rebuilding TBB to make sure that all exception-related symbols are exported, and see if it helps.

Thank you for taking the time to look into this.

This is the compiler used to build TBB:

[:~/tmp] $ strings /usr/local/lib/libtbb.dylib  | grep -A 1 BUILD_CLANG
TBB: BUILD_CLANG
Apple LLVM version 9.0.0 (clang-900.0.37)

I uploaded the diff between the gcc (left) and the clang build (right) here: https://pastebin.com/X6uSPifL
Can't see anything related to exceptions there.

With your comments about the explicit symbol exports in mind, I had a look at mac64-tbb-export.lst and found this line to be the likely culprit:

__TBB_SYMBOL( _ZTISt13runtime_error )

If I remove just that line and rebuild tbb (with either gcc or clang), the original example works. The demangler tells me that this is the typeinfo for std::runtime_error.
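The demangling step can also be done programmatically. The sketch below assumes a GCC or Clang toolchain, whose runtime provides `abi::__cxa_demangle` in `<cxxabi.h>`; the helper name `demangle` is made up for illustration and is not part of TBB.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: demangles an Itanium-ABI symbol name and returns
// the input unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
  int status = 0;
  char* result =
      abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || result == nullptr) {
    return mangled;
  }
  std::string readable(result);
  std::free(result);  // the buffer is allocated with malloc by __cxa_demangle
  return readable;
}
```

With this helper, `demangle("_ZTISt13runtime_error")` should yield `typeinfo for std::runtime_error`. Pass the name with a single leading underscore, as it is written in the export list, rather than the double-underscore form that nm prints on OS X.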

Can you please provide the output of `strings /usr/local/lib/libtbb.dylib | grep TBB:`?

Unfortunately, removing something from the export list is not the solution, because in that case I'm afraid the exceptions thrown by TBB will not be properly caught by the application. We need to figure out whether adding something to the export list may help. Can you find (e.g. with nm) all the symbols related to runtime_error in the application binary, and try adding those to the TBB export list? One thing to know is that symbols start with a double underscore on OS X but are written with a single underscore in the list (the second one is added automatically by the __TBB_SYMBOL macro).

Sure, I was not suggesting that one should remove that row, just that it relates to the problem.

Looking at the output of `nm -g withouttbb` (https://pastebin.com/cFh3wvRm) I found that `__ZNSt13runtime_errorC1EPKc` is not in `mac64-tbb-export.lst`. The demangler says that this is the constructor:

_std::runtime_error::runtime_error(char const*)

Simply adding that to the lst file is not an option:

dyld: lazy symbol binding failed: Symbol not found: __ZNSt13runtime_errorC1EPKc
  Referenced from: /Users/markus/tmp/./test (which was built for Mac OS X 10.12)
  Expected in: /usr/local/lib/libtbb.dylib

dyld: Symbol not found: __ZNSt13runtime_errorC1EPKc
  Referenced from: /Users/markus/tmp/./test (which was built for Mac OS X 10.12)
  Expected in: /usr/local/lib/libtbb.dylib

Here is the requested string list:

[:~/tmp] $ strings /usr/local/lib/libtbb.dylib | grep TBB:
TBB: VERSION
TBB: INTERFACE VERSION
TBB: BUILD_DATE
TBB: BUILD_HOST
TBB: BUILD_OS
TBB: BUILD_KERNEL
TBB: BUILD_CLANG
TBB: BUILD_XCODE
TBB: BUILD_TARGET
TBB: BUILD_COMMAND
TBB: TBB_USE_DEBUG
TBB: TBB_USE_ASSERT
TBB: DO_ITT_NOTIFY
TBB: %s

I checked which symbols related to runtime_error are used in TBB object files, and only found constructors (which we cannot export, since they are not defined in TBB and are not virtual), the destructor, and the typeinfo. That is, even if some symbol is missing, it is not clear which one.

There is another way to check if TBB is guilty. You can try building with Clang a small shared library with a single function that throws runtime_error. Do not change or restrict symbol visibility in any way - so it is up to the toolchain to decide which symbols are needed. And then link your GCC-built test with that library instead of TBB.

If the application keeps failing in this case, I would say the problem is in GCC (since GCC, not Clang, is the outsider on this platform). If it works, you can see which symbols are exported from the small shared library, and can try adding those to the TBB export lists.
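A minimal sketch of that experiment is shown below. It is not the code from the thread: the function names and the file split described in the comments are assumptions, and both halves are kept in one listing only to keep it short.

```cpp
#include <exception>
#include <stdexcept>

// Library half. In the experiment this would go into its own file
// (say thrower.cpp, a made-up name) and be built with Clang into a
// shared library, e.g. with -dynamiclib, without any visibility flags
// or export list.
void throw_runtime_error() {
  throw std::runtime_error("foo");
}

// Application half. In the experiment this would be built with GCC and
// linked against the library above instead of TBB.
bool caught_as_std_exception() {
  try {
    throw_runtime_error();
  } catch (const std::exception&) {
    return true;   // the runtime_error was matched through its base class
  } catch (...) {
    return false;  // something was thrown but not recognized as std::exception
  }
  return false;    // nothing was thrown
}
```

Running `nm -g` on the resulting library would then show which exception-related symbols the toolchain chose to export on its own.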

On a side note, it looks like grepping strings from macOS binaries requires `-A 1`, as you did for BUILD_CLANG before; otherwise the useful values are not shown. Anyway, I no longer think that info would be of much help.

Building my own dylib and linking against it works: https://pastebin.com/tcp3NsWq
Interestingly enough, this *does* export the constructor.

I also found that throwing with "throw new" and catching the pointer does not cause a segfault.
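The "throw new" variant mentioned above might look like the following sketch. The function name is hypothetical, and the thread does not say which pointer type was used in the handler; a pointer to the base class is assumed here.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical variant of the reproducer: the exception object is allocated
// with new and a pointer is thrown, so the handler matches a pointer type.
std::string throw_new_and_catch_pointer() {
  std::string message;
  try {
    throw new std::runtime_error("BLA");
  } catch (const std::exception* exception) {
    message = exception->what();
    delete exception;  // the handler owns the object; the destructor is virtual
  }
  return message;
}
```

Throwing by pointer is generally discouraged because the handler has to free the object; it is shown here only to illustrate the observation.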

Also, I found this page: https://libcxxabi.llvm.org/
Maybe the last paragraph helps?

The FAQ at libcxxabi page explains why the type_info and destructor should be visible in a dynamic library - without that, it would not be possible to ensure that exception classes used in different modules are actually of the same type.

The test does not fail in the same way with your small library, but still does not work correctly. Since runtime_error is a derivative of std::exception, it should have been caught in main() - but it is not; terminate() is called instead:

libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: foo

I already tried adding the constructor to the export list; it did not work. I think that's because the compiler treats it as an external symbol and does not emit a definition for it in the shared library.

Also I think that TBB binaries are likely built with Intel Compiler, not Clang. It can be checked with 

strings /usr/local/lib/libtbb.dylib  | grep -A 1 BUILD_COMMAND

All in all, I tend to think that compilers implement the ABI somewhat differently, and incompatibilities cause the application failures. Which of the compilers to blame is hard to say. 
