Task and task group priorities in TBB

What I like most about Intel® Threading Building Blocks (TBB) library is its incessant evolution. Having been first released almost five years ago and enjoying quite broad adoption in the software development industry since then, it still keeps growing new features at unabated pace. Sustaining this trend and advancing TBB’s quest for ever better composability, just published Update 6 to TBB 3.0 introduces a new version of TBB task scheduler that allows assigning priorities to tasks and task groups. Thus today we’ll talk at length about this new functionality.

Before moving to specifics of TBB’s task priorities implementation (fortunately it is pretty straightforward for the most part, and even though there are a few niceties and caveats, you are not running the risk of drowning in them) let’s start with an overview of what kinds of priority exist and what they are used for.

First of all priorities can be dynamic or static. The former allows changing entity’s priority unlimited number of times, even after the entity have been included into the schedule. Static priority means that after an entity has been scheduled, its priority remains fixed. An immediate consequence of this differentiation is that support of dynamic priorities provides more flexibility to developers, but at the cost of more complex scheduling logic and larger overhead than those in case of limiting priorities to the static scheme only.

Another attribute of priorities implementation is how priority values are represented and how they are compared. Three primary cases exist:

    • Integer numbers that are compared as integers. Further variants are possible in practice:

        • Small number of fixed values.

        • Large range of values (there is only a finite number of values in any integer range).

        • Fixed size small set of dynamically changing values (e.g. when each next frame or in decoding pipeline is assigned priority lower than its predecessors, but there cannot be more than K frames in flight simultaneously).


    • Floating point numbers that are compared as floats. In this case a range of values is normally permitted, often as large as all positive real numbers. But even with a shorter range like [0, 1], the number of floating point values inside it can be virtually infinite, limited only by the number of bits used in the floating point value binary representation.

    • Arbitrarily complex data structures or more general data sets that are compared by means of program supplied functors. When priorities of this kind are used, the scheduler should generally assume indefinitely large amount of items with different priorities.



Similarly to dynamic vs. static scheme comparison, these alternatives also entail an inherent tradeoff between simplicity and smaller overhead vs. greater flexibility and expressive power.

Now let’s have a look at the priority scheduling usages that can be found in practice. I’d combine them into the following three major groups:

    1. Scheduling threads in general purpose OSes.

        1. Giving precedence to the work that must be done right now before the processing, results of which will be necessary some time in the future. An important use case from this category is ensuring program responsiveness to external events by starting their handlers at a higher priority level or lowering priority of concurrently running computations.

        1. Promoting work that certainly must be done at the expense of speculative or sacrificeable activities.


      Since we are talking about intra-process concurrency (current TBB domain), prioritizable entities here are threads, their priorities are dynamic, and only a relatively small fixed set of values is supported. For example Windows provides 7 priority levels for threads in the same process, Linux – 40 levels (for user mode non-realtime light weight processes).

    1. Scheduling tasks in real time OSes. Here priorities are usually computed based on task temporal parameters – deadline and estimated execution time.
      Priority values are either real numbers (that is there is virtually infinite amount of them), or integers in a wide enough range (hundreds or more). Most often static priority scheme is used.

    1. Priorities in numerical and data processing algorithms that rely on data items prioritization as part of their main computational logic, such as spatial data bases indexing, discrete events simulation, and AI search-tree problems just to name a few. If data items are represented in the form of tasks, control flow of such algorithms can be transformed into a task scheduling problem.
      Expectedly enough such use cases usually involve infinite range of priority values of any conceivable nature, with comparison operations often specified as functors of various complexity levels.



In conclusion of our short overview of the priority concept here is a table that summarizes characteristics of priority schemes used in major OSes and concurrency frameworks:



























































Framework Type Number of  levels Granularity Starvation resistant
Windows threads Dynamic, preemptive 7 Thread Yes (prio boost)
Pthreads Dynamic, preemptive 40 LWP Yes (prio boost)
GCD Static, non-preemptive 3 Block (lambda), serial queue no?
GCD Suspension,
non-preemptive
n/a Serial queue n/a
Data-parallel: 

OpenCL, CUDA,  ZPL, Floodgate, Intel® ArrBB
- - - -
General purpose native: 

PPL, OpenMP, Intel® Cilk Plus, Go, X10, Fastflow
- - - -
General purpose managed: 

TPL,JCU, Parallel Task
- - - -


 

Notably, none of the user mode concurrency frameworks supports dynamic priorities, and only GCD provides static ones. Microsoft’s TPL seems to have QueuedTaskScheduler that supports queues with different priorities (similar to GCD). But it is not documented officially in MSDN, only mentioned in the ParallelExtensionsExtras Tour blog. ConcRT shipped with Visual Studio 2010 allows specifying priority of a scheduler instance at the moment of its creation, which is in fact priority of threads in the underlying thread pool and can be used only for static prioritizing at the task level.

If after reading to this point you’ve become feeling a bit uneasy about conglomeration of different scenarios involving priorities (at least I was), you may relax, as TBB in no way tries to cram it all into itself (and then dump it on the heads of its users). Instead based on our customer requests we target scenarios mostly from the first group, and even though our solution is not completely identical to the approach used for prioritizing OS threads by either Windows or Linux, much similarity exists on the usage model level.

Priority related additions to TBB API are pretty compact:


namespace tbb {

enum priority_t {
priority_normal = /* implementation specific */,
priority_low = /* implementation specific */,
priority_high = /* implementation specific */
};

class task_group_context /* ... */ {
// . . .
public:
void set_priority ( priority_t p );
priority_t priority () const;
};

class task /* ... */ {
// . . .
public:
static void enqueue ( task& t, priority_t );
void set_group_priority ( priority_t );
priority_t group_priority () const;
};

} // namespace tbb


There are 3 priority levels: “low”, “normal”, and “high”. Though at first glance it may appear to be too few, actually “normal” + “high” and “low” + “normal” levels cover all the primary use cases involving precedence and responsiveness from both functionality and expressiveness standpoint. Of course, if convincing scenarios requiring more levels ever appear, this set can be extended in the future.

Interestingly, having only 3 priority levels has a somewhat unexpected property of alleviating composability problem inherent to priority schemes. The issue with multiple priority levels is that assumptions of one component may be overridden by another concurrent activity. E.g. in say 5-level setup component A may set its task group at priority “above normal” and expect it to proceed (while reserving level “highest” for a possible higher priority event handler). However if around the same time component B starts an algorithm at the “highest” level, then A’s “above normal” task group will be stalled (unexpectedly to A). In 3-level scheme both A and B would have to use the same level “high”, and A’s computation would never be preempted.

TBB supports both static and dynamic priorities, but they apply to different kinds of task. The former are used with tasks that are scheduled via task::enqueue() method, and are specified via its additional argument. Original form of  task::enqueue() is preserved and schedules the task at the normal priority level.

Dynamic priorities apply to task groups only and can be specified and changed later either directly via new task_group_context::set_priority() method or indirectly via task::set_group_priority(). The latter obviously affects the whole group, which the given task belongs to.

If you wonder why we introduce such differentiation and why dynamic priorities are not applicable on per-task basis, there are several factors that influenced our design decision.

Most parallel algorithms generate large amount of tasks, while most usage models involving dynamic priorities need to change the priority of the whole computation. In case of per-task granularity this would require means to track all the individual tasks constituting the computation. Required relation information is not (and unlikely ever be) maintained by the library because of exorbitant impact on the performance and scalability.

Dynamic priorities imply an entity that can be used to reference the target of priority setting. This conclusively precludes task object from carrying priority attribute, as normally TBB task is automatically destroyed after its execution (if only not explicitly recycled during its run), and thus accessing it after it has been spawned or enqueued is unsafe.

Similarly to cancelation TBB cannot preempt a running task, which narrows the effect area of task level dynamic priorities to only in between of allocation point and task execution beginning. Besides with TBB tasks being mostly short enough (tens of microseconds) re-prioritizing already running task makes little sense.

On the other hand task group context object is guaranteed (in a well-formed program) to encompass lifetime of the task tree associated with it. Besides it is already used for cancelation and provides API for grouping tasks and specifying relations between different task groups, as well as internal infrastructure that can be reused for priority change propagation between nested task groups. At last considering task group as a unit of prioritization allows us to naturally implement preemption (again, similarly to cancellation). All this makes task group context an ideal choice for carrying dynamic priority attribute.

However in case of recently introduced FIFO tasks tasks scheduled via task::enqueue() primary usage models imply operating with individual task objects, often in fire-and-forget manner (in contrast to fork-join style task batches used to implement computational algorithms). This suggests that priorities should be supported on the individual basis as well. And static priorities seem to be a good fit here. In a Three Layer Cake setup where FIFO tasks are used to build the top application logical layer. Their function is either to execute some relatively short code (probably enqueuing other tasks along the way) or to spawn a traditional fork-join parallel workload. The latter can be associated with an explicit task group context that will be used for changing priority of a (long) running traditional task hierarchy spawned from inside the FIFO task.

When a task with a new priority appears or priority of a task group changes, execution of tasks in other threads evidently should be affected. In TBB there are two scopes of priority change effect:

    • Local to a master thread.

    • Global across all master threads in the process.



(Let me remind here that master thread is any thread created by the application that starts TBB parallel work. Threads supplied by TBB are called worker threads or simply workers.)

The local scope corresponds to an internal arena object. It comprises all the tasks originated by the given master thread, and includes both the master itself and workers servicing it at the moment. When a task or a task group with higher priority appears, execution of all the tasks with lower priority in this arena is postponed until the higher priority ones are processed.

The global scope includes only worker threads. When priority of a task in one of the arenas (remember that each master thread has an arena of its own) is increased above that of the highest priority task in other arenas, the scheduler signals these lower priority arenas to relinquish their worker threads so that they could move into higher priority ones.

More generally, the scheduler groups arenas by their priority level (that is in accordance with the highest priority of a task in an arena), and first assigns workers to the highest priority arenas. Only if there are some workers left, they are assigned to lower priority arenas.

Another point to note is that master threads are never stalled due to priority changes. Even when all workers are revoked from a lower priority arena, its master thread continues executing its tasks. Such design has a purpose to avoid unexpected behavior of isolated components that may rely on constant progress ensured by at least single threaded execution (as it was in pre-priority-enabled TBB schedulers).

Now that we’ve discussed basic design elements of this new feature, let’s shortly talk about its performance aspects. As the above considerations imply we’ve tried hard to avoid adding features with most significant negative performance impacting, such as unlimited number of priority levels or priority values of generic nature. However nothing is really free in this life, and even the simply looking priority facilities we now provide may entail noticeable overhead when used.

The words “when used” are the key ones in the previous sentence. We’ve managed to move most of the priority related book-keeping operations from the scheduler hot paths. Thus applications not using priorities should not experience any noticeable changes in their performance and scalability. The same should be true when priority changes happen infrequently, that is on the scale of milliseconds. This seems to be quite a reasonable condition for applications that will use priorities in response to user actions or changes in external environment, or to (re)prioritize processing of large lumps of work. That is in a way we envisioned it will be used.

The last thing to mention in this overview of the new task and task group priority support is that due its intrusiveness (its implementation involved changes in multiple parts of TBB task scheduler) it is released as a Community Preview feature. Thus to have it enabled you need to define the following macro before including task.h header:


#define TBB_PREVIEW_TASK_PRIORITY 1
#include “tbb/task.h”


and link with the special version of the TBB library:




























OS Release build Debug build
Windows tbb_preview.{lib,dll} tbb_preview_debug.{lib,dll}
Linux libtbb_preview.so.2 libtbb_preview_debug.so.2
OS X libtbb_preview.dylib libtbb_preview_debug.dylib
Sun OS libtbb_preview.so libtbb_preview_debug.so


 

If you build TBB from sources, you’ll need to specify “tbb_cpf=1” on make command line.

And of course another purpose of releasing a feature as Community Preview is to get early feedback from its users. So if you tried to use task priorities and found that our design does not completely suits your needs or resulting performance is below your expectations, don’t hesitate contacting us on the official TBB forum.

 

Nähere Informationen zur Compiler-Optimierung finden Sie in unserem Optimierungshinweis.