TBB: Beyond Do Loops
By Arch Robison (Intel) (30 posts) on December 18, 2006 at 11:20 pm
Clay's blog http://softwarecommunity.intel.com/ISN/Community/en-us/blogs/multi-core-thredmonkey/archive/2006/12/18/30228042.aspx asks whether Intel® Threading Building Blocks [Intel® TBB] is a solution looking for a problem. OpenMP is great if you have Fortran code, or C code that looks like Fortran, or C++ that looks like Fortran. In other words, it is great for flat, do-loop-centric parallelism. With TBB, we're trying to go beyond that and enable generic parallel programming.
For example, TBB provides a parallel sort. If you look at the implementation in include/tbb/parallel_sort.h, you'll see that it's a parallel quicksort implemented using parallel_for, without any explicit recursion. We can do that because TBB lets you define your own iteration spaces. You just have to specify signatures for:
- Is the space empty?
- Should the iteration space be split?
- If it should be split, how to split it.
TBB provides one- and two-dimensional spaces. These work for signed and unsigned integral types and pointer types. In contrast, an OpenMP parallel for loop is restricted to a signed integral type and to a single dimension.
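To make the three questions above concrete, here is a minimal sketch of a user-defined iteration space. It does not use TBB itself: `split_tag`, `IndexRange`, `for_each_subrange`, and `double_all` are hypothetical names, and the driver is a serial stand-in for a scheduler, so the example compiles on its own. The member signatures (`empty()`, `is_divisible()`, and a splitting constructor) are meant to mirror the shape of what TBB asks a range to provide; treat the details as assumptions and check them against the TBB headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag that selects the splitting constructor.
// (TBB has its own tag type; this stand-in keeps the sketch self-contained.)
struct split_tag {};

// A half-open index range [begin, end) that answers the three questions.
class IndexRange {
public:
    IndexRange(std::size_t begin, std::size_t end, std::size_t grain)
        : begin_(begin), end_(end), grain_(grain) {
        assert(begin <= end && grain > 0);
    }

    // "How to split it": the new range takes the upper half,
    // and `other` is shrunk to the lower half.
    IndexRange(IndexRange& other, split_tag)
        : begin_(other.begin_ + (other.end_ - other.begin_) / 2),
          end_(other.end_),
          grain_(other.grain_) {
        other.end_ = begin_;
    }

    // "Is the space empty?"
    bool empty() const { return begin_ == end_; }

    // "Should the iteration space be split?"
    bool is_divisible() const { return end_ - begin_ > grain_; }

    std::size_t begin() const { return begin_; }
    std::size_t end() const { return end_; }

private:
    std::size_t begin_, end_, grain_;
};

// Serial stand-in for a parallel scheduler: keep splitting while the range
// says it is divisible, then hand each leaf subrange to the body.
template <typename Range, typename Body>
void for_each_subrange(Range r, const Body& body) {
    if (r.empty()) return;
    if (r.is_divisible()) {
        Range upper(r, split_tag{});  // r now holds only the lower half
        for_each_subrange(r, body);
        for_each_subrange(upper, body);
    } else {
        body(r);
    }
}

// Example body: double every element, one subrange at a time.
void double_all(std::vector<int>& v, std::size_t grain) {
    for_each_subrange(IndexRange(0, v.size(), grain),
                      [&v](const IndexRange& r) {
                          for (std::size_t i = r.begin(); i != r.end(); ++i)
                              v[i] *= 2;
                      });
}
```

The recursion lives entirely in the driver, so the code that uses the range never recurses explicitly. A real scheduler would be free to run the two halves on different threads.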
The TBB parallel_reduce is another good example of being generic. Consider the simple problem of finding the index of the minimum value in an array. OpenMP allows reductions only on built-in types. But to solve this problem, you need a reduction over a pair type (value,index). The TBB parallel_reduce works generically on any type. You just have to provide a few signatures. Section 3.3.1 of the TBB Tutorial explains the "index of minimum" solution in detail.
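The "index of minimum" reduction can be sketched without TBB to show why a (value, index) pair is needed. This is not the code from the TBB Tutorial: `MinIndexBody`, `reduce_range`, `index_of_min`, `split_tag`, and `kNoIndex` are hypothetical names, and the driver is a serial stand-in for the library. The body is assumed to supply three things, loosely following the shape a generic reduction needs: a splitting constructor, an `operator()` that accumulates over a subrange, and a `join` that merges two partial results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag that selects the splitting constructor.
struct split_tag {};

// Sentinel meaning "no element seen yet" (returned for an empty array).
constexpr std::size_t kNoIndex = static_cast<std::size_t>(-1);

// Reduction state is the pair (value, index), not a single built-in value.
class MinIndexBody {
public:
    explicit MinIndexBody(const std::vector<float>& data)
        : data_(&data), value_(0.0f), index_(kNoIndex) {}

    // Splitting constructor: same input array, fresh (empty) partial result.
    MinIndexBody(MinIndexBody& other, split_tag)
        : data_(other.data_), value_(0.0f), index_(kNoIndex) {}

    // Accumulate over the subrange [begin, end).
    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i != end; ++i) {
            float v = (*data_)[i];
            if (index_ == kNoIndex || v < value_) {
                value_ = v;
                index_ = i;
            }
        }
    }

    // Merge another partial result; ties go to the lower index.
    void join(const MinIndexBody& rhs) {
        if (rhs.index_ == kNoIndex) return;
        if (index_ == kNoIndex || rhs.value_ < value_ ||
            (rhs.value_ == value_ && rhs.index_ < index_)) {
            value_ = rhs.value_;
            index_ = rhs.index_;
        }
    }

    std::size_t index() const { return index_; }

private:
    const std::vector<float>* data_;
    float value_;
    std::size_t index_;
};

// Serial stand-in for a parallel reduction: split the range in half,
// reduce each half with its own body, then join the partial results.
template <typename Body>
void reduce_range(std::size_t begin, std::size_t end, std::size_t grain,
                  Body& body) {
    if (end - begin <= grain) {
        body(begin, end);
        return;
    }
    std::size_t mid = begin + (end - begin) / 2;
    Body right(body, split_tag{});
    reduce_range(begin, mid, grain, body);
    reduce_range(mid, end, grain, right);
    body.join(right);
}

std::size_t index_of_min(const std::vector<float>& data, std::size_t grain) {
    assert(grain > 0);
    MinIndexBody body(data);
    reduce_range(0, data.size(), grain, body);
    return body.index();
}
```

Because `join` breaks ties by index, the result does not depend on how the range happens to be split, which matters once the halves run concurrently.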
Clay finds the primes example in TBB a bit long. That's because it's a reasonably good algorithm (e.g. it's blocked for cache and does dynamic memory allocation). It's not the ridiculously slow "try all divisors" algorithm found in some "how to thread" articles.
If OpenMP fits, use it. If it doesn't, consider TBB as the next step.
Categories: Parallel Programming, Software Tools
Tags: TBB
Comments (3)
| January 2, 2007 4:10 PM PST
Arch Robison (Intel) | Thanks for noticing this. The blogging software is unreliable and frequently loses my posts. So I usually copy-and-paste the content into a Word document, and copy it back if the blogging software loses the post. I suspect the fixed-width cases may be those that I cut-and-pasted from the Word document. |
| February 5, 2007 7:06 PM PST
Robyn Tippins | That's probably exactly the problem. Word wreaks havoc on any blogging platform. Clearing out the code with a copy/paste into Notepad before pasting into the editor has always been my non-technical way to prevent this annoyance. |


Timmie Smith
This post and the one about supporting the double-check pattern seem to be formatted to a fixed width. The text runs under the apps on the right side of the screen. Not sure if it is a Mozilla bug or something different in how you composed these two posts. The rest of them look fine.
Welcome (belatedly) to the blogging world.