Transitioning from Intel® TBB 2.1 to 2.2

Some of the changes in Intel® TBB 2.2 could affect your existing Intel® TBB code. Here’s a handy guide to making the transition from version 2.1 quickly and painlessly.

Concurrent queue API changes
The new concurrent queue implementation comes in two flavors: the unbounded concurrent_queue and the concurrent_bounded_queue. Use the unbounded form if you need only basic non-blocking push/pop operations. Otherwise, use the bounded form, which supports both blocking and non-blocking push/pop operations.

The new API renames four methods:
1. pop_if_present becomes try_pop
2. push_if_not_full becomes try_push in concurrent_bounded_queue (it doesn’t exist for concurrent_queue)
3. begin becomes unsafe_begin
4. end becomes unsafe_end

You should transition to the new API as soon as possible. You can still use the old API by compiling with the pre-processor symbol TBB_DEPRECATED=1.
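
As a rough sketch of the renamed non-blocking pop, the helper below drains a queue with try_pop. It is written as a template so that it should apply equally to tbb::concurrent_queue<T> and tbb::concurrent_bounded_queue<T>. The names drain, drain_example, and StandInQueue are hypothetical: StandInQueue is a single-threaded placeholder included only so the snippet compiles without the TBB headers, and it is not how TBB implements its queues.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, single-threaded stand-in that exposes the 2.2-style names
// used below (push, try_pop). NOT thread-safe and NOT TBB's implementation;
// substitute tbb::concurrent_queue<T> in real code.
template <typename T>
class StandInQueue {
public:
    void push(const T& value) { items_.push_back(value); }

    // Non-blocking pop: returns false immediately if the queue is empty.
    bool try_pop(T& result) {
        if (items_.empty()) return false;
        result = items_.front();
        items_.pop_front();
        return true;
    }

private:
    std::deque<T> items_;
};

// Moves everything currently in the queue into `out` and returns the count.
// With the 2.1 API the loop condition would have been q.pop_if_present(item).
template <typename Queue, typename T>
std::size_t drain(Queue& q, std::vector<T>& out) {
    std::size_t count = 0;
    T item;
    while (q.try_pop(item)) {
        out.push_back(item);
        ++count;
    }
    return count;
}

// Usage sketch with the stand-in queue.
inline void drain_example() {
    StandInQueue<int> q;
    q.push(1);
    q.push(2);
    std::vector<int> out;
    assert(drain(q, out) == 2);          // both items popped
    assert(out[0] == 1 && out[1] == 2);  // in FIFO order
    assert(drain(q, out) == 0);          // queue is now empty
}
```

Because try_pop never waits, a loop like this only sees items that are already in the queue; code that needs to wait for producers would use the blocking pop of concurrent_bounded_queue instead.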

Concurrent vector API changes
Three functions in the concurrent_vector implementation now have different return types:
1. grow_by used to return size_type but now returns iterator
2. grow_to_at_least used to return void but now returns iterator
3. push_back used to return size_type but now returns iterator

The example below shows how this applies to the common use case of appending a sequence to a concurrent_vector. With Intel® TBB 2.1, one might write the following function:

template<typename T>
void Append(concurrent_vector<T>& x, const T* begin, const T* end) {
    std::copy(begin, end, x.begin()+x.grow_by(end-begin));
}

Intel® TBB 2.2 makes the body of that function a little simpler:
std::copy(begin, end, x.grow_by(end-begin));

One function of concurrent_vector has been renamed without any change to its semantics: compact() has been replaced by shrink_to_fit().

As with the queue changes, compiling with TBB_DEPRECATED=1 will restore the 2.1 behavior.
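
Code that relied on the size_type formerly returned by push_back can recover the index by subtracting begin() from the returned iterator. The sketch below illustrates that pattern under stated assumptions: append_and_index, append_and_index_example, and StandInVector are hypothetical names, and StandInVector is a single-threaded wrapper around std::vector that only mimics the 2.2 return type so the snippet compiles without the TBB headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, single-threaded stand-in whose push_back returns an iterator,
// as concurrent_vector::push_back does in 2.2. NOT thread-safe; substitute
// tbb::concurrent_vector<T> in real code.
template <typename T>
class StandInVector {
public:
    typedef typename std::vector<T>::iterator iterator;

    iterator begin() { return items_.begin(); }

    // Appends the value and returns an iterator to the new element.
    iterator push_back(const T& value) {
        items_.push_back(value);
        return items_.end() - 1;
    }

private:
    std::vector<T> items_;
};

// Appends `value` and returns its index, which is what 2.1 code received
// directly from push_back.
template <typename Vector, typename T>
std::size_t append_and_index(Vector& v, const T& value) {
    typename Vector::iterator it = v.push_back(value);
    return static_cast<std::size_t>(it - v.begin());
}

// Usage sketch with the stand-in vector.
inline void append_and_index_example() {
    StandInVector<int> v;
    assert(append_and_index(v, 10) == 0);  // first element lands at index 0
    assert(append_and_index(v, 20) == 1);  // second element at index 1
}
```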

Task API changes
The notion of task depth has been eliminated in Intel® TBB 2.2. Four members of class task are affected: the type depth_type and the methods depth, set_depth and add_to_depth. These members may still be used, but they have no effect.

Default partitioner change
The default partitioner for loop templates is now auto_partitioner(); previously it was simple_partitioner(). If your code runs as fast or faster with the new default, no change is needed. If performance degrades, specify simple_partitioner explicitly in your loops to restore the 2.1 partitioning behavior.

For example, here is 2.1 code that relied on simple_partitioner as the default:

parallel_for(
    blocked_range<int>(starty, stopy, grainsize),
    ParallelLoopBody()
);

To keep the same partitioning in 2.2, pass the partitioner explicitly:

parallel_for(
    blocked_range<int>(starty, stopy, grainsize),
    ParallelLoopBody(),
    simple_partitioner()
);

You can also compile with TBB_DEPRECATED=1 to use the old default.
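
To see why the two partitioners can perform differently, it helps to look at how simple_partitioner divides work: it keeps halving a range until each piece is no larger than the grain size, regardless of how many threads are available. The sketch below reproduces that splitting rule serially. Chunk and simple_partition are hypothetical names, and the code only illustrates the resulting chunk boundaries; it is not TBB's scheduler and says nothing about execution order.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open index range [first, second).
typedef std::pair<std::size_t, std::size_t> Chunk;

// Recursively halves [begin, end) until each piece holds at most `grainsize`
// elements, then records the piece. This mirrors the divisibility rule of
// blocked_range (divisible while size > grainsize, split at the midpoint).
inline void simple_partition(std::size_t begin, std::size_t end,
                             std::size_t grainsize, std::vector<Chunk>& out) {
    assert(grainsize > 0);
    assert(begin <= end);
    if (end - begin <= grainsize) {
        if (begin < end) out.push_back(Chunk(begin, end));
        return;
    }
    std::size_t middle = begin + (end - begin) / 2;
    simple_partition(begin, middle, grainsize, out);
    simple_partition(middle, end, grainsize, out);
}
```

For a range of 10 iterations and a grain size of 3, this produces the four chunks [0,2), [2,5), [5,7), and [7,10). With auto_partitioner the grain size acts as a lower bound on chunk size rather than a target, so it may stop splitting sooner, which is one plausible reason a loop tuned for simple_partitioner can behave differently under the new default.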

Please ask any questions you may have about Intel® TBB 2.2 on the forum. Good luck with the transition!


Comments

PPL's concurrent_queue has no blocking pop, hence neither does tbb::strict_ppl::concurrent_queue. The blocking pop is available in tbb::concurrent_bounded_queue.

The design argument for omitting blocking pop is that in many cases, the synchronization for blocking is provided outside of the queue, in which case the implementation of blocking inside the queue becomes unnecessary overhead. On the other hand, the blocking pop of the old tbb::concurrent_queue was popular among users who did not have outside synchronization. So we split the functionality. Use cases that do not need blocking or boundedness can use the new tbb::concurrent_queue, and use cases that do need it can use tbb::concurrent_bounded_queue.


Hi!

What happened to tbb::strict_ppl::concurrent_queue<T>::pop(T &) ?? There seems to be no blocking pop method any longer.

Thanks,

Shawn