Cilk++ vs. Pthreads

In a short time I will be demonstrating the Cilk++ software platform to my colleagues. I am certain that one of the questions will be: what about pthreads?

I know very little about POSIX threads, or pthreads, except that they have existed for some time and are used in multi-threaded C/C++ programs on Linux and Windows platforms (and more).

They are an established way to write multithreaded programs.

From what I can see, Cilk++ provides a much simpler way to parallelize legacy C or C++ code. The overhead is lower in a Cilk++ parallel process, and Cilk++ has superior ways to handle (and detect) data races.

Any thoughts on this will be greatly appreciated, since I must be ready for such a question. It is inevitable.


Multithreading and parallelism are not necessarily the same thing. A program can be multithreaded for many different reasons, even on a single core. For example, it is common to have a UI thread that is responsive to the user while a computation thread runs in the background. Some programs are best structured as producer threads and consumer threads. It is also sometimes desirable to have one thread per session or user, as in a web server. None of these threading applications are about getting the most out of a multicore processor. In all of the preceding applications, the threads are created and destroyed infrequently and communicate through some kind of message protocol. All of them are perfectly good applications for pthreads and not for Cilk.

Conversely, parallel programming is about marshalling the processing resources of multiple cores to do a single job. In this case, you want serial semantics with automatic load-balancing among available resources. The parallelism expressed in the code must be composable, so that calling a parallel subroutine from another parallel subroutine will not oversubscribe your machine. It should be practical to express much more parallelism in your program than there are cores to execute it. That way, as you scale up to more cores, the program will automatically take advantage of the additional resources. This is where pthreads fails us and where Cilk comes in. One could say that the cost of starting a thread using pthreads is hundreds of times the cost of a cilk_spawn, and that starting too many threads will oversubscribe the machine and create overhead, but that would be missing the key point. The key point is that the manual creation and destruction of threads and the manual allocation of work to threads does not preserve serial semantics and does not easily balance the load. It is just not the right paradigm for getting the job done.
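To make "manual allocation of work to threads" concrete, here is a minimal sketch of a parallel sum written directly against pthreads. The names `Chunk`, `sum_chunk` and `parallel_sum` are made up for illustration, and the static split into equal chunks is just one simple choice, not a recommended design.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

// Hypothetical work descriptor: one contiguous slice of the input per thread.
struct Chunk {
    const int* data;
    std::size_t begin;
    std::size_t end;
    long long sum;
};

// Thread entry point: sums the slice described by the Chunk it is given.
void* sum_chunk(void* arg) {
    Chunk* c = static_cast<Chunk*>(arg);
    long long s = 0;
    for (std::size_t i = c->begin; i < c->end; ++i) s += c->data[i];
    c->sum = s;
    return nullptr;
}

// The caller decides how many threads to create and which elements each gets.
long long parallel_sum(const std::vector<int>& v, std::size_t num_threads) {
    if (num_threads == 0) num_threads = 1;
    const std::size_t n = v.size();
    std::vector<Chunk> chunks(num_threads);
    std::vector<pthread_t> tids(num_threads);
    std::vector<bool> started(num_threads, false);

    for (std::size_t t = 0; t < num_threads; ++t) {
        // Static partitioning: equal-sized slices, fixed before any work runs.
        chunks[t] = Chunk{v.data(), n * t / num_threads,
                          n * (t + 1) / num_threads, 0};
        started[t] =
            pthread_create(&tids[t], nullptr, sum_chunk, &chunks[t]) == 0;
        // If the thread could not be created, do that slice in the caller.
        if (!started[t]) sum_chunk(&chunks[t]);
    }

    long long total = 0;
    for (std::size_t t = 0; t < num_threads; ++t) {
        if (started[t]) pthread_join(tids[t], nullptr);
        total += chunks[t].sum;
    }
    return total;
}
```

Calling `parallel_sum(v, 4)` always creates four threads, whatever the machine looks like. If a routine that is itself running on four threads calls it, you get sixteen, which is the oversubscription problem described above. And if some slices were more expensive than others, the fixed split would leave threads idle while the slowest one finishes.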

A Cilk program looks like a serial program. A pthreads program looks like a multithreaded program. The distinction is much more than skin deep. A pthreads program will create multiple threads even if there is only one core, whereas a Cilk program will not. That is because "threads" are not a core concept in Cilk. Rather, in a Cilk program, you describe potential parallelism and the Cilk runtime scheduler uses threads "under the covers" to distribute the parallel tasks among the available cores.
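A rough illustration of that difference, using the usual Fibonacci toy example. The Cilk++ version appears only as a comment because it needs the Cilk++ compiler; the pthreads version below it is ordinary C++ and does just one level of forking. `FibTask`, `fib_thread` and `fib_pthreads` are hypothetical names for this sketch.

```cpp
#include <pthread.h>

// Plain serial code. A Cilk++ version would keep this shape and only add
// keywords, roughly like this (not compiled here):
//
//   long fib(int n) {
//       if (n < 2) return n;
//       long x = cilk_spawn fib(n - 1);  // may run in parallel
//       long y = fib(n - 2);
//       cilk_sync;                       // wait for the spawned call
//       return x + y;
//   }
long fib_serial(int n) {
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

// With pthreads, arguments and results have to be packed into a struct,
// because a thread entry point only takes and returns void*.
struct FibTask {
    int n;
    long result;
};

void* fib_thread(void* arg) {
    FibTask* task = static_cast<FibTask*>(arg);
    task->result = fib_serial(task->n);
    return nullptr;
}

// One manual fork-join: fib(n - 1) in a new thread, fib(n - 2) in the caller.
// A second thread is created here even on a single-core machine.
long fib_pthreads(int n) {
    if (n < 2) return n;
    FibTask task{n - 1, 0};
    pthread_t tid;
    if (pthread_create(&tid, nullptr, fib_thread, &task) != 0) {
        // Thread creation failed: fall back to the serial computation.
        return fib_serial(n - 1) + fib_serial(n - 2);
    }
    long y = fib_serial(n - 2);
    pthread_join(tid, nullptr);
    return task.result + y;
}
```

Making `fib_pthreads` fork at every level of the recursion, the way the commented Cilk++ version does, would create a thread per call, so in practice you end up writing cutoffs and work distribution by hand.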

Can you use pthreads to build a load-balanced parallel framework? Of course. It's called Cilk, or TBB. You can build it yourself, in about 5 years, but you will probably end up with something inferior.

In summary: Use pthreads when you need long-lived threads where you care about the scheduling, i.e., where you want to make sure that multiple things all make progress even if it slows some of them down. Use a parallel language like Cilk when you don't care about the order in which the tasks complete, but where you want to maximize the use of processing resources to get the job done. The choice is not mutually exclusive: you can use Cilk in your computation thread and still create a pthread to manage your UI, for example.

