Retooling Exceptions for Parallelism - Part 2

My previous blog discussed exceptions as alternative control flow versus exceptions as alternative data values.  Here I'll take that notion further and sketch how I think the TBB task scheduler should be deal with exceptions.

A quick review of the TBB task scheduler. It's the low-level engine that drives the high-level templates like parallel_for. The high-level templates are designed for convenience. The task scheduler is designed for speed, sometimes at the expense of convenience. It's low-level, after all, and we didn't want to penalize speed across the board for some convenience that only a few clients might use. Users define a task's work by overriding the virtual method task::execute().
Currently, if execute() throws an exception, the effect is undefined. "Undefined", as typically used in standards parlance, means anything can happen. Like crash your program or beam flying pigs into your room. So up to now, it's been the programmers job to wrap the body of execute() in try-catch if exception safety is an issue. We've gotten negative comments on this.  Some programmers want exception safety.  Others claim exception safety is so complex that the cure is worse than the disease, and leads to a false sense of security.. But nonetheless, since C++ offers the option of being exception safe, we should give users better support for making TBB tasks exception safe.
Exceptions are a difficult subject, even for sequential programs. Exceptions were a late addition to C++, presented in Stroustrup's Annotated Reference Manual (ARM) as an experimental feature.  I implemented most of the exception optimizations in KAI C++, and so am familiar with the difficulties.  That said, the rest of this blog proposes how to improve exception-handling support in the TBB task scheduler.
There's two patterns to consider, blocking-style and continuation-passing style.  Let's start with blocking style, because it is easier to understand. Suppose task C has spawned two tasks A and B, like this:

/ \

If A throws an exception, C's pending "wait" should act as if it threw the exception, after all children have completed. It's tempting to say that B should be cancelled immediately. But asynchronous cancellation invites a lot of problems. Having cancellation points, as in pthreads, might be a solution. For for now, I'd like to keep things simple. Cancellation usually requires some high level coordination, something above what a low-level task is capable of. Maybe later TBB can have a notification hook to class task so that C can be notified that an exception is coming its way, and C can do the cancellation. But that's an additional complex frill to be added later [2].

If both A and B throw an exception, then one of the exceptions should be chosen. In principle, both could be glued together into some kind of list-of-exception object, but I doubt that's of much help to real code, and probably a gratuitous resource hog.  For example, remember all 1000 iterations of a loop that threw an exception seems like overkill.  I'm inclined to let the first exception detected propagate, and suppress any later exceptions that collide with it in the task tree.
Continuation-passing style is trickier. Consider the earlier picture, but where C is not currently waiting, because it is a continuation task that will start running after tasks A and B complete. Suppose A and B complete, but at least one threw an exception. C.execute() is not yet running, so there is no obvious point from which the exception should be thrown. The right thing to do is to skip C.execute() and simply destroy C. Maybe later TBB can have notification hooks so that C can catch the exception. The hook would be a virtual method, say task::handle_exception(), that would be called instead of execute() when a child throws an exception. But to keep the first version simple, let's omit that. The Resource Acquisition Is Initialization (RAII) idiom probably suffices for cleanup in common cases. Let the destructor of C do any necessary cleanup.
Is this a plausible design? Since the task scheduler is the engine that drives the high-level templates, we have to ask if the proposal enables us to write an exception-safe parallel_for. I think it does.  Below is the code where the TBB 2.0 parallel_for does its work. I've added comments.

     template<typename Range, typename Body, typename Partitioner>
task* start_for<Range,Body,Partitioner>::execute() {
if( my_partitioner.should_execute_range(my_range, *this) ) {
// Base case of recursion.
my_body( my_range );
return NULL;
} else {
// Recursive case. Build continuation task c.
empty_task& c = *new( allocate_continuation() ) empty_task;
// Child a is the recycled this, and hence is not named a explicitly.
// Create and spawn child b
start_for& b = *new( c.allocate_child() ) start_for(Range(my_range,split()),my_body,Partitioner(my_partitioner,split()));
// Child a bypasses scheduler.
return this;

It is NOT exception-safe under the proposal.  In particular, if the line with allocate_child() throws an exception, child b is not created. This would leave c.ref_count==2, but with only one child (the recycled this).  Furthermore, even that child will never execute, because the code is requesting execution via scheduler bypass (return this). 
But the proposal gives us the option to make it exception safe, by rewriting the else part like this: 

             empty_task& c = *new( allocate_continuation() ) empty_task;
start_for* b;
try {
b = new( c.allocate_child() ) start_for(Range(my_range,split()),my_body,Partitioner(my_partitioner,split()));
} catch( ... ) {
return this;

This fragment is exception-safe because:

    • If allocate_continuation() throws an exception, then there
      is nothing to clean up.

    • If any code inside the try block throws an exception, then the catch
      clause cleans up c, and rethrows.

    • Code after the try block cannot throw an exception.

This example shows that adding the exception-propagation mechanism to TBB does
not automatically make code exception-safe, but at least enables
it.  That should be no surprise, because the same is true of C++.
- Arch


[1] C++ does not offer a way to capture an exception on one thread and rethrow it in another. Thus the mechanism described will have to be limited to rethrowing some kind of summary of the original exception, or limited to exceptions that have some kind of "movable" property. See comments in previous blog.
[2] C++ exception handling typically operates in two phases:

    1. Searching the stack for an appropriate handler

    1. Unwinding the stack to that handler

In a parallel environment, I suspect that the first step should be done before all children complete, and the second step done after all children complete. When the appropriate handler is found, a "prehandler" should be invoked. The prehandler could do cancellation.
[3] A task is automatically deleted after its method execute() returns. If such a task throws an exception, it makes sense to automatically delete it too. But a task is not automatically deleted if it has been subjected to a recycle_.. method. That raises the question of what to do if execute() throws an exception after the task has been recycled. I think the answer is to automatically delete the task nonetheless, but do not have use we should probably delete it, as if the recycling never happened. We'll need some real use cases to drive this decision.

For more complete information about compiler optimizations, see our Optimization Notice.


Arch D. Robison (Intel)'s picture

Thanks for pointing out the erroneous duplication. I removed the call at the end of the line.

anonymous's picture

Did you mean to have two calls to "c.spawn(b)" in the first snippet (one at the end of the line)?

Add a Comment

Have a technical question? Visit our forums. Have site or software product issues? Contact support.