Exception Handling and Cancellation in TBB - Part III – Use cases with nested parallelism

Now that we’ve familiarized ourselves with the simple scenarios in the previous part of this post, let’s consider more complex, but nevertheless ubiquitous use case of nested parallelism (maybe it is even more widespread that non-nested usages).

When we considered a standalone algorithm, canceling any of its tasks naturally caused unconditional cancellation of all its other tasks. By the way, just in case, I tend to use terms “algorithm” and “task hierarchy” interchangeably, so do not be confused when you come across them, they both mean a bunch of (closely) related tasks having (possibly indirectly) a common root task.

In case of nested algorithms we may (and surely will) want more control over what happens when cancellation request arrives to the different parts of the hierarchy. To relieve you from the burden of getting into the details of a bunch of invented examples pretending to real-lifeness, lets consider instead a simple abstract use case:

Parallel algorithm A runs nested algorithms B[x] (x = 1, …, P) in each of its parallel branches (index x discriminates different simultaneously running instances of B).

Our usage model of cancellation in such hierarchies was primarily influenced by exception handling requirements - remember, any unhandled exception cancels the group, to which the failed task belongs. So just imagine that exceptions happen in different parts of this simple nested hierarchy. The following basic situations are possible:

    1. One of A tasks is cancelled. This means that the rest of A’s tasks is cancelled as well. But what to do with the instances of the nested algorithm B, which are already running (or will be started by the currently running A tasks). Obviously we have two options:

        • a. Leave all B[x] running until completion.

        • b. Cancel all B[x] as well.

    1. One of B[x] tasks is cancelled. Besides of cancelling all the tasks belonging to B[x], the following options exist:

        • a. Do not cancel anything else.

        • b. Cancel both A and the rest of B[y].

        • c. Cancel all other B[y] algorithms, and leave A intact.

        • d. Cancel A and leave all other B[y] intact.

TBB cancellation model supports all these behaviors except 2c and 2d. We left out 2c and 2d because on one hand we did not and still do not know any real use case where such a behavior would be necessary. On he other hand supporting them would have complicated both usage model and implementation, and most probably negatively impacted the performance of the solution.

These basic situations (reactions to cancelling an instance of either A or B algorithm) can be combined in the following scenarios of nested algorithms hierarchy behavior (I use the qualifiers parent and descendant for the nesting and nested algorithms correspondingly):

I. Cancelling parent does not affect its descendants (1a). Cancelling a descendant does not affect its parent and siblings (2a).
II. Cancelling parent cancels its descendants too (1b). Cancelling a descendant does not affect its parent and siblings (2a).
III. Cancelling parent cancels its descendants too (1b). Cancelling a descendant cancels its parent and all its siblings (2b).

Combination 1a + 2b is not supported because of the same reasons as basic situations 2c and 2d.

In order to support these scenarios TBB introduces 2 flavors of task group contexts and allows sharing contexts between algorithms.

When nested algorithms share their parent’s context, we naturally get scenario III. This is the most efficient mode of building the nested hierarchy from the performance standpoint. This is also the default behavior of the code that explicitly uses TBB tasks but do not use contexts (for example existing code based on TBB 2.0).

When instances of parent and descendant algorithms have different contexts (never mind their flavors) we get the situation 2a. To implement scenarios I and II we need to discriminate situations 1a and 1b while supporting 2a. Here is where context flavors come into play. Since we need to differentiate two situations only, two context flavors are defined (flavors are called “kinds” in the TBB sources and TBB Doxygen documentation)

    • bound

    • isolated

In terms of our simple example the flavor of the parent s context is irrelevant to our use cases. What is important is the flavor of the descendant context. If it is isolated, then we have scenario I. If the flavor of the descendant context is bound, then we have scenario II.

In other words, bound contexts form a tree. When a context in such tree is cancelled, all its descendant contexts (in the subtree rooted at the cancelled context) are cancelled too. Isolated contexts serve as roots of the context trees (that is descendant bound contexts can be tied to isolated ones). Sharing a context object between algorithms results in one big cancellation group.

Having already mentioned a few unsupported use cases, I would like to complete the use cases analysis by describing usages that are technically possible but result in undefined behavior. Unfortunately TBB does not have means to detect them in runtime (doing this would have substantially twisted the control flow of the debug build unacceptably deviating it from that of the release build). There are two main use cases of this sort (that is bad ones):

    • Using the same context object to start several outermost root tasks (algorithms) in different masters. Look what happens when a task tree growing in one of such masters is collapsed. Since the context object is shared by other task trees (having different roots) those other trees will be collapsed too. But in this case, when one of the trees finishes collapsing, it is impossible to say whether other trees has already been collapsed. Thus this context object can never be reused, and it is never safe to destroy it.

    • Similar problem will take place when an outer algorithm running in the context C1 creates context object C2 and starts several nested algorithms in C2.

And, finally, consider the following nested hierarchy:

    Algorith1 with Context1 ->
Algorithm2 with Context2 ->
Algorithm3 with Context1 ->
Algorithm4 with Context2

Such a hierarchy is very inefficient from the cancellation performance standpoint. For example, suppose that Algorithm4 is cancelled. Tasks of Algorithm 2 running in the same Context2 will start being collapsed too, but their descendants running in the Context1 (tasks belonging to the Algorithm3) will continue running until full completion, doing meaningless work. Besides these running Algorithm3 tasks will continue spawning new Algorithm4 instances, which will never have a chance to be executed, but which tasks may still be stolen by other threads, what obviously has a negative impact on the overall performance.

Fortunately all these bad use cases do not seem to be natural ones (that is ones you might want to apply intentionally). But nevertheless they may be a result of an unintentional error, so it may turn out useful to be aware of their negative consequences.

Last time I promised to describe how to create context objects and associate them with tasks and algorithms, but I’m sorry, it looks like I’ll have to postpone it until the next post. See you in Intel Software Network!

For more complete information about compiler optimizations, see our Optimization Notice.