What's New? Intel® Threading Building Blocks 4.2

One of the best-known C++ threading libraries, Intel® Threading Building Blocks (Intel® TBB), was recently updated to release 4.2. The new version adds several key features compared to the previous release, 4.1. Some of them first appeared in Intel TBB 4.1 updates.

The new synchronization primitive speculative_spin_mutex introduces support for speculative locking. It relies on the Intel® Transactional Synchronization Extensions (Intel® TSX) hardware feature available in 4th generation Intel® Core™ processors. On processors that support hardware transactional memory (such as Intel® TSX), speculative mutexes let multiple threads acquire the same lock, as long as there are no “conflicts” that could produce results different from non-speculative locking. As a result, no serialization happens in non-contended cases, which may significantly improve performance and scalability for “short” critical sections. If there is no hardware support for transactional synchronization, speculative mutexes behave like their non-speculating counterparts, but possibly with worse performance.

Intel TBB now supports exact exception propagation (based on C++11 exception_ptr). With exception_ptr, exception objects can be safely copied between threads, which makes exception handling in a multithreaded environment more flexible. Exact exception propagation is now available in the prebuilt binaries for all platforms: OS X*, Windows* and Linux*. On OS X there are two sets of binaries. The first is linked with the gcc standard library; it is used by default and does not support exact exception propagation. To use the feature, take the second set of binaries, which is linked with libc++, the C++ standard library in Clang. To use these binaries, set up the Intel TBB environment and build your application in the following way:

     $ source tbbvars.sh libc++
     $ clang++ -stdlib=libc++ -std=c++11 concurrent_code.cpp -ltbb
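The C++11 mechanism that exact propagation builds on can be illustrated with the standard library alone. The sketch below is not how Intel TBB implements the feature internally; it only shows, using a hypothetical run_and_report function, how exception_ptr lets an exception thrown on one thread be rethrown on another with its original type intact.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

// Runs a task on a worker thread, captures the exception it throws, and
// rethrows the same exception object on the calling thread.
std::string run_and_report() {
    std::exception_ptr captured;  // stays null unless an exception is stored

    std::thread worker([&captured] {
        try {
            throw std::out_of_range("index 42 is out of range");
        } catch (...) {
            // Store a handle to the in-flight exception.
            captured = std::current_exception();
        }
    });
    worker.join();  // after join, reading 'captured' here is safe

    try {
        if (captured) std::rethrow_exception(captured);
    } catch (const std::out_of_range& e) {
        // The exact dynamic type survives the trip between threads.
        return std::string("out_of_range: ") + e.what();
    } catch (const std::exception& e) {
        return std::string("exception: ") + e.what();
    }
    return "no exception";
}
```

The first handler is the one that runs, because the rethrown object is still a std::out_of_range rather than a generic substitute.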

In addition to the concurrent_unordered_set and concurrent_unordered_map containers, we now provide concurrent_unordered_multiset and concurrent_unordered_multimap, based on the Microsoft* PPL prototype. concurrent_unordered_multiset allows an item to be inserted more than once, which is not possible in concurrent_unordered_set. Similarly, concurrent_unordered_multimap allows inserting more than one <key,value> pair with the same key value. For both “multi” containers, find returns the first item (or <key,value> pair) in the table with a matching search key.

Intel TBB containers can now be conveniently initialized with value lists as specified by C++11 (initializer lists):

tbb::concurrent_vector<int> v({1, 2, 3, 4, 5});

Currently initializer lists are supported by the following containers:

concurrent_vector
concurrent_hash_map
concurrent_unordered_set
concurrent_unordered_multiset
concurrent_unordered_map
concurrent_unordered_multimap
concurrent_priority_queue

The scalable memory allocator keeps per-thread caches of allocated memory. This is done for the sake of performance, but often at the cost of increased memory usage. Although the memory allocator tries hard to avoid excessive memory usage, for complex cases Intel TBB 4.2 gives more control to the programmer: it is now possible to reduce memory consumption by cleaning thread caches with the scalable_allocation_command() function. Overall allocator performance was also improved in several ways.

The Intel TBB library is widely used on different platforms. Mobile developers can now find prebuilt binary files for Android in the Linux OS package. Binary files for Windows Store applications have been added to the Windows OS package.

Atomic variables of type tbb::atomic<T> now have constructors when used in C++11. This allows programmers to initialize them with a value at declaration, with constant expressions properly supported. Currently this works for the gcc and Clang compilers:

tbb::atomic<int> v = 5;

A new community preview feature allows waiting until all worker threads terminate. This may be needed if an application forks processes, or if the Intel TBB dynamic library can be unloaded at runtime (e.g. if Intel TBB is part of a plugin). To enable waiting for workers, initialize the task_scheduler_init object this way:

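#define TBB_PREVIEW_WAITING_FOR_WORKERS 1  // must be defined before the TBB header is included
#include "tbb/task_scheduler_init.h"
// threads: requested number of threads; 0: use the default stack size
tbb::task_scheduler_init scheduler_obj(threads, 0, /*wait_workers=*/true);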

Find the new Intel TBB 4.2 at commercial and open source sites. Download and enjoy the new functionality!

Related Links and Resources

To learn more about Intel tools for the Android developer, visit Intel® Developer Zone for Android.
