The user-driven evolution of tbb::pipeline

Pipelining is a common and widely used pattern to have some job done by few workers, i.e. in parallel. Threading Building Blocks cannot miss this pattern, and we provided tbb::pipeline since the first version. You might learn some basics about the algorithm and its use in TBB documents and in Robert Reed’sOverlapping I/O and processing in a pipeline” blogging series. Here I will tell you what has changed in the pipeline implementation since TBB 2.0.

Opening TBB 2.0 code and providing it as free software gave us an excellent source of new ideas for improvements and enhancements in the library, in addition to the feedback from customers. The pipeline implementation also got a few. All the proposals were reviewed by the team; and the following were implemented:

Fixed idle spinning of worker threads in case of blocked input
Threads spinning idle always were the area of concern for TBB customers; I can recall a few places where we had to minimize idle spinning. The issue in pipeline was missed for some time however, because we did not expect that a thread may block inside input stage. It was obviously an oversight; pipeline users wanted it to read the data coming dynamically at runtime, e.g. from network, therefore waiting for data inside the input filter is very possible. The fix for the issue required significant overhaul of the implementation; however it was so important that we released it as part of TBB 2.0 Update 3. And the new implementation allowed the next improvement.

Pipeline input stage allowed to be parallel
In TBB 2.0 and before, the pipeline input stage was always serialized, even if the corresponding filter was constructed as parallel. After the aforementioned fix, it became possible to allow input stages run in parallel; and we implemented this improvement.
Be aware that if you used a “parallel” input stage with TBB 2.0, switching to TBB 2.1 may reveal some hidden pitfalls. In order to work correctly, the parallel input filter should conform to some non-formalized requirements:
- it should be thread-safe, as it will be called by different threads simultaneously;
- it should be tolerant to latecomers, because it can be called even after one of the previous calls returned NULL;
- it should better not block, otherwise every thread could block there.
Interestingly, this change has some affect on serial stages. As you might know, serial stages of TBB pipeline are in fact strictly ordered. In TBB 2.0, the order was defined exactly by the input stage; and all subsequent serial stages processed items (tokens) in this order. But now, the input stage can be parallel; in this case, the token order is defined by the first serial stage encountered in the pipeline. It happened to be easier than thread-safe token number assignment for parallel input stage; and it is better this way, because the first serial stage runs as "first come first served" that might improve total throughput.

Enumeration is used to distinguish filter types
It’s an example of community influence on TBB: we were requested at the forum to change Boolean parameter to an enumeration for filter type specification. Even if we only have serial vs. parallel filters, using enumeration improves code readability and allows compile-time check of the argument. So now you could write MyFilter::MyFilter() : tbb::filter(serial) {} instead of MyFilter::MyFilter() : tbb::filter(/*is_serial*/ true) {}. It really is easier to read, isn’t it? Thanks to the community for pointing the developers to these small but handy improvements.

Automatic removal of filters
This improvement in how the pipeline object should be destroyed was the last of the changes accepted for the next version of TBB. In TBB 2.0, one needs to call clear() for the pipeline prior to filter deletion; it is necessary for safe pipeline destruction, and it is documented. But why can’t filter destructor remove the filter from its pipeline? Well, now it can. But doing this in the way that is compatible with the previous implementation was somewhat tricky. We needed to change the single-linked list of filters to become double-linked, and add a pointer to the pipeline. Thus the layout of the filter class was changed, but user-defined filters inherit this class – so how we keep it backward compatible? Check the code to find the answer. :)

You might check the improved pipeline functionality with all the above features in the recent open-source stable release (at the moment of writing, it was tbb21_20080605oss).

There also were a couple of internal proposals we declined, and there remains a list of things still to be done or under consideration, such as:
- We added general support for cancellation and handling of exceptions to TBB (see Andrey Marochko’s blog for interesting details); now the pipeline should be improved to work correctly in case of cancellation.
- Type-safe filters were also proposed at the forum together with the implementation. We like the idea of compile-time type checks and operator>> semantics to build a pipeline, and would be glad to elaborate on it and improve the implementation; but unfortunately the author did not contribute it officially to the project.
- Out-of-order serial filters were considered and postponed because we did not see a really good motivation case. A case was recently proposed at the forum where unordered serial filter might improve output latency.
- The idea of automatic selection of the maximal number of tokens in flight was proposed, but it’s hard to find heuristics that would work well enough for every use case.
- And some others.
So we are not yet at the end of tbb::pipeline evolution.

And by the way, have you noticed how many times I mentioned the TBB community and the forum? If you have a great idea for TBB, you are very welcome with it. We listen to you; leaving a comment in this blog is a simple way to make your contribution. :)
For more complete information about compiler optimizations, see our Optimization Notice.