Tasks or Software Threads?

Most modern threading platforms already offer task-based programming models. This lets developers follow one of the eight rules for multicore programming that James Reinders wrote a few years ago. I’m specifically talking about rule #3: "Program in tasks (chores), not threads (cores)."

James suggests that you keep the mapping of tasks to hardware threads as a distinctly separate operation in your code. When you create tasks using an efficient task-based programming model, you can create as many as the problem needs without worrying about oversubscription. Of course, tasks still introduce overhead of their own, so it is always important to measure the actual speedup.
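As a rough illustration of keeping tasks separate from the threads that run them, here is a minimal Python sketch. It assumes the standard library's `ThreadPoolExecutor` as a stand-in for a task scheduler; it is not one of the platforms discussed in this post, and the names `chore` and `run_chores` are hypothetical. The caller only describes the tasks, and the pool decides which thread runs each one.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def chore(n):
    """A small unit of work (hypothetical): n squared plus the id of the thread that ran it."""
    return n * n, threading.get_ident()


def run_chores(task_count, worker_count=None):
    """Submit task_count tasks; the pool, not the caller, maps them to threads."""
    # Assumption: one pool thread per logical core unless the caller says otherwise.
    workers = worker_count or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = list(pool.map(chore, range(task_count)))
    results = [value for value, _ in outcomes]
    threads_used = {ident for _, ident in outcomes}
    return results, threads_used, workers


if __name__ == "__main__":
    results, threads_used, workers = run_chores(1000)
    print(f"{len(results)} tasks ran on {len(threads_used)} of {workers} pool threads")
```

However many tasks are submitted, the number of software threads never exceeds the pool size, which is why creating many tasks does not by itself cause oversubscription.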

Tasks run on software threads, and task schedulers use many different techniques to reduce the overhead needed to schedule work while taking advantage of the underlying hardware threads (logical cores). Code written with tasks is usually easier to read than its pure thread version. One of the key advantages of tasks is that they usually require less overhead to create than threads. As a result, some algorithms that are simple to implement using dozens of tasks incur less overhead than the same algorithms implemented using dozens of threads. This doesn’t mean that you should add tasks everywhere. They have to be used in a smart way.
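One way to check the creation overhead for yourself is to time both approaches on the same tiny workload. The sketch below again assumes Python's standard library rather than any of the platforms named in this post, and `tiny_work`, `with_threads` and `with_tasks` are hypothetical names. Because CPython's global interpreter lock prevents CPU-bound threads from running in parallel, treat the numbers as an indication of scheduling cost only, not of speedup. Results vary by machine, so no expected timings are shown.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_work(n):
    """Deliberately small workload so that creation and scheduling cost dominates."""
    return n + 1


def with_threads(count):
    """One software thread per work item."""
    results = [None] * count

    def target(i):
        results[i] = tiny_work(i)

    start = time.perf_counter()
    threads = [threading.Thread(target=target, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


def with_tasks(count, workers=4):
    """One task per work item, executed by a small pool of reused threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(tiny_work, range(count)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for label, run in (("threads", with_threads), ("tasks", with_tasks)):
        results, elapsed = run(2000)
        print(f"{label}: {elapsed * 1000:.1f} ms for {len(results)} items")
```

Both functions return the same results, so any difference in elapsed time comes from how the work is dispatched. Measuring like this before committing to either model is in the spirit of the advice above.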

I’ve written a few posts about the new task-based programming model in C# 4 with .NET 4. You can read about the specific implementation of tasks in Visual Studio 2010 in my post "Tasks Are Not Threads". I wrote it when Visual Studio 2010 was in Beta 1. The Release Candidate is now available, but the concepts explained in that post are still valid.

Intel® Threading Building Blocks (Intel® TBB), Intel® Cilk++, OpenMP and QuickThread all include task-based programming models. It is worth learning what each of them offers so that you can express parallelism at a much finer granularity. Then you can decide whether your algorithm would run better using tasks or threads.

You can read the eight rules in the article published by James Reinders in Dr. Dobb’s: "Rules for Parallel Programming for Multicore".


2 comments


Gastón,

Many thanks to you and your Intel colleagues for promoting and supporting tasks in parallel programming. The success of tasks in some applications raises the hope that the task concept can be generalized, and tool support extended, to bring that success to more applications. An example follows. A task is an encapsulated piece of application execution. A procedure is an encapsulated piece of application code. A procedure call can be implemented as a task. For some applications, this might help with the overhead and granularity issues you mention. The following presentation introduces this and other examples that try to extend the success of tasks to other applications: http://www.cecam.org/workshop-4-306.html?presentation_id=3627

Regards,
Burkhard

Michael K. (Intel)

Hi!

While OpenMP (with version 3.0) indeed offers task-based programming, all other constructs of OpenMP follow the same rules.

OpenMP's worksharing construct only defines how the iteration space of a loop is cut into what OpenMP calls "chunks". One can view these chunks as a task to compute a portion of the loop's iteration space. This view is not limited to dynamic scheduling but also extends to static scheduling and other scheduling kinds.

A similar argument applies to OpenMP's "sections" construct. In my view, the "task" construct is the dynamic extension of the static "sections" construct. Again, OpenMP does not prescribe a section-to-thread mapping, but generically mentions tasks that are assigned to threads.

Although the OpenMP spec frequently refers to "threads" as the entities that execute an OpenMP region, there is not necessarily a 1:1 mapping of OpenMP threads to OS threads. A conforming implementation might schedule n OpenMP threads on m OS threads (e.g. through fibers).

Cheers,
-michael
