Intel® Threading Building Blocks Release Notes and New Features

This page provides the current Release Notes for Threading Building Blocks. The notes are categorized by year, from newest to oldest, with individual releases listed within each year.

Each version's entry summarizes the new features and changes since the previous release. The detailed release notes for each version include important information such as prerequisites, software compatibility, installation instructions, and known issues.

To get product updates, log in to the Intel® Software Development Products Registration Center.
For questions or technical support, visit Intel® Software Developer Support.

2019

Update 8

Release Notes

Bug fixed:

Downloads

Update 7

Release Notes

What’s New in this release:

  • Added TBBMALLOC_SET_HUGE_SIZE_THRESHOLD parameter to set the lower bound for allocations that are not released back to OS unless a cleanup is explicitly requested.
  • Added zip_iterator::base() method to get the tuple of underlying iterators.
  • Improved async_node to never block a thread that sends a message through its gateway.
  • Extended decrement port of the tbb::flow::limiter_node to accept messages of integral types.
  • Added support of Windows* to the CMake module TBBInstallConfig.
  • Added packaging of CMake configuration files to TBB packages built using the build/build.py script (https://github.com/intel/tbb/issues/141).

Changes affecting backward compatibility:

  • Removed the number_of_decrement_predecessors parameter from the constructor of flow::limiter_node. To allow its usage, set TBB_DEPRECATED_LIMITER_NODE_CONSTRUCTOR macro to 1.

Preview Features:

  • Added ordered associative containers: concurrent_{map,multimap,set,multiset} (requires C++11).

Open-source contributions integrated:

Downloads

Update 6

Release Notes

What’s New in this release:

  • Added support for Microsoft* Visual Studio* 2019.
  • Added support for enqueuing tbb::task into tbb::task_arena (https://github.com/01org/tbb/issues/116).
  • Improved support for allocator propagation on concurrent_hash_map assigning and swapping.
  • Improved scalable_allocation_command cleanup operations to release more memory buffered by the calling thread.
  • Separated allocation of small and large objects into distinct memory regions, which helps to reduce excessive memory caching inside the TBB allocator.

Preview Features:

  • Removed template class gfx_factory from the flow graph API.

Downloads

  • TBB 2019 U6 is available as a part of Intel(R) Parallel Studio XE 2019 Update 4.
  • In addition, you can download the latest TBB open source version from https://github.com/01org/tbb/releases.

Update 5

Release Notes

What's New in this release:

  • Associating a task_scheduler_observer with an implicit or explicit task arena is now a fully supported feature.
  • Added a CMake module, TBBInstallConfig, that generates and installs CMake configuration files for TBB packages. Inspired by Hans Johnson (https://github.com/01org/tbb/pull/119).
  • Added node handles, methods merge() and unsafe_extract() to concurrent unordered containers.
  • Added constructors with Compare argument to concurrent_priority_queue (https://github.com/01org/tbb/issues/109).
  • Controlling the stack size of worker threads is now supported for Universal Windows Platform.
  • Improved tbb::zip_iterator to work with algorithms that swap values via iterators.
  • Improved support for user-specified allocators in concurrent_hash_map, including construction of allocator-aware data types.
  • For ReaderWriterMutex types, upgrades and downgrades now succeed if the mutex is already in the requested state. Inspired by Niadb (https://github.com/01org/tbb/pull/122).

Preview Features:

  • The task_scheduler_observer::may_sleep() method has been removed.

Bugs fixed:

  • Fixed the issue with a pipeline parallel filter executing serially if it follows a thread-bound filter.
  • Fixed a performance regression observed when multiple parallel algorithms start simultaneously.

Downloads

Update 4

Release Notes

What's New in this release:

  • global_control class is now a fully supported feature.
  • Added deduction guides for tbb containers: concurrent_hash_map, concurrent_unordered_map, concurrent_unordered_set.
  • Added tbb::scalable_memory_resource function returning std::pmr::memory_resource interface to the TBB memory allocator.
  • Added tbb::cache_aligned_resource class that implements std::pmr::memory_resource with cache alignment and no false sharing.
  • Added rml::pool_msize function returning the usable size of a memory block allocated from a given memory pool.
  • Added default and copy constructors for tbb::counting_iterator and tbb::zip_iterator.
  • Added TBB_malloc_replacement_log function to obtain the status of dynamic memory allocation replacement (Windows* only).
  • CMake configuration file now supports release-only and debug-only configurations (https://github.com/01org/tbb/issues/113).
  • TBBBuild CMake module takes the C++ version from CMAKE_CXX_STANDARD.

Bugs fixed:

  • Fixed compilation for tbb::concurrent_vector when used with std::pmr::polymorphic_allocator.

Open-source contributions integrated:

Downloads

  • Intel TBB 2019 U4 is available as a part of Intel(R) Parallel Studio XE 2019 Update 3.
  • In addition, you can download the latest Intel TBB open source version from https://github.com/01org/tbb/releases.

Update 3

Release Notes

What's New in this release:

  • Added tbb::transform_iterator.
  • Added new Makefile target 'profile' to flow graph examples enabling additional support for Intel® Parallel Studio XE tools.
  • Added TBB_MALLOC_DISABLE_REPLACEMENT environment variable to switch off dynamic memory allocation replacement on Windows*. Inspired by a contribution from Edward Lam.

Preview Features:

  • Extended flow graph API to support relative priorities for functional nodes, specified as an optional parameter to the node constructors.

Open-source contributions integrated:

Downloads

Update 2

Release Notes

What’s New in this release:

  • Threading Building Blocks 2019 Update 2 includes functional and security updates. Users should update to the latest version.
  • Added constructors with HashCompare argument to concurrent_hash_map (https://github.com/01org/tbb/pull/63).
  • Added overloads for parallel_reduce with default partitioner and user-supplied context.
  • Added deduction guides for tbb containers: concurrent_vector, concurrent_queue, concurrent_bounded_queue, concurrent_priority_queue.
  • Reallocation of memory objects >1MB now copies and frees memory if the size is decreased twice or more, trading performance off for reduced memory usage.
  • After a period of sleep, TBB worker threads now prefer returning to their last used task arena.

Bugs fixed:

Update 1

Release Notes

What's New in this release:

  • Doxygen documentation can now be built with the 'make doxygen' command.

Changes affecting backward compatibility:

  • Enforced 8 byte alignment for tbb::atomic<long long> and tbb::atomic<double>. On IA-32 architecture it may cause layout changes in structures that use these types.

Bugs fixed:

  • Fixed an issue with dynamic memory allocation replacement on Windows* that occurred for some versions of ucrtbase.dll.
  • Fixed possible deadlock in tbbmalloc cleanup procedure during process shutdown.
  • Fixed usage of std::uncaught_exception() deprecated in C++17 (https://github.com/01org/tbb/issues/67).
  • Fixed a crash when a local observer is activated after an arena observer.
  • Fixed compilation of task_group.h by Visual C++* 15.7 with /permissive- option (https://github.com/01org/tbb/issues/53).
  • Fixed tbb4py to avoid dependency on Intel(R) C++ Compiler shared libraries.
  • Fixed compilation for Anaconda environment with GCC 7.3 and higher.

Downloads

Initial Release

Release Notes

Threading Building Blocks (TBB), one of the best-known C++ threading libraries, has been updated to release 2019. The updated version contains several key new features compared to the previous 2018 Update 5 release.

What's New in this release:

  • Lightweight policy for functional nodes in the flow graph is now a fully supported feature.
  • Reservation support in flow::write_once_node and flow::overwrite_node is now a fully supported feature.
  • Support for Flow Graph Analyzer and improvements for Intel(R) VTune(TM) Amplifier are now a regular feature enabled by the TBB_USE_THREADING_TOOLS macro.
  • Added support for std::new_handler in the replacement functions for global operator new.
  • Added C++14 constructors to concurrent unordered containers.
  • Added tbb::counting_iterator and tbb::zip_iterator.
  • Fixed multiple -Wextra warnings in TBB source files.

Preview Features:

  • Extracting nodes from a flow graph is deprecated and disabled by default. To enable, use TBB_DEPRECATED_FLOW_NODE_EXTRACTION macro.

Changes affecting backward compatibility:

  • Due to internal changes in the flow graph classes, recompilation is recommended for all binaries that use the flow graph.

Open-source contributions integrated:

  • Added support for OpenBSD by Anthony J. Bentley.

2018

Update 6

Release Notes

What’s New in this release:

Bugs fixed:

  • Fixed an issue with dynamic memory allocation replacement on Windows* that occurred for some versions of ucrtbase.dll.

Update 5

Release Notes

Changes (w.r.t. Intel TBB 2018 Update 4):

Preview Features:

  • Added user event tracing API for Intel(R) VTune(TM) Amplifier and Flow Graph Analyzer.

Bugs fixed:

Open-source contributions integrated:

Downloads

Update 4

Release Notes

Changes (w.r.t. Intel TBB 2018 Update 3):

Preview Features:

  • Improved support for Flow Graph Analyzer and Intel(R) VTune(TM) Amplifier in the task scheduler and generic parallel algorithms.
  • Default device set for opencl_node now includes all the devices from the first available OpenCL* platform.
  • Added lightweight policy for functional nodes in the flow graph. It indicates that the node body has little work and should, if possible, be executed immediately upon receiving a message, avoiding task scheduling overhead.
Update 3

Release Notes

Changes (w.r.t. Intel TBB 2018 Update 2):

Preview Features:

  • Added template class blocked_rangeNd for a generic multi-dimensional range (requires C++11). Inspired by a contribution from Jeff Hammond.

Bugs fixed:

  • Fixed a crash with dynamic memory allocation replacement on Windows* for applications using system() function.
  • Fixed parallel_deterministic_reduce to split range correctly when used with static_partitioner.
  • Fixed a synchronization issue in task_group::run_and_wait() which caused a simultaneous call to task_group::wait() to return prematurely.

Downloads

Update 2

Release Notes

Changes (w.r.t. Intel TBB 2018 Update 1):

  • Added support for Android* NDK r16, macOS* 10.13, Fedora* 26.
  • Binaries for Universal Windows Driver (vc14_uwd) now link with static Microsoft* runtime libraries, and are only available in commercial releases.
  • Extended flow graph documentation with more code samples.

Preview Features:

  • Added a Python* module for multi-processing computations in numeric Python* libraries.

Bugs fixed:

  • Fixed constructors of concurrent_hash_map to be exception-safe.
  • Fixed auto-initialization in the main thread to be cleaned up at shutdown.
  • Fixed a crash when tbbmalloc_proxy is used together with dbghelp.
  • Fixed static_partitioner to assign tasks properly in case of nested parallelism.

Update 1

Release Notes

The updated version (Open Source release only) contains these additions:

  • lambda-friendly overloads for parallel_scan.
  • support of static and simple partitioners in parallel_deterministic_reduce.

We also introduced a few preview features:

  • initial support for Flow Graph Analyzer in parallel_for.
  • reservation support in overwrite_node and write_once_node.

Bugs fixed

  • Fixed a potential deadlock scenario in the flow graph that affected Intel® TBB 2018 Initial Release.

Initial Release

Release Notes

Intel® Threading Building Blocks (Intel® TBB), one of the best-known C++ threading libraries, has been updated to release 2018. The updated version contains several key new features compared to the previous 2017 Update 7 release (https://software.intel.com/en-us/articles/whats-new-intel-threading-building-blocks-2017-update-7).

Licensing

Intel® TBB outbound license for commercial support is Intel Simplified Software License: https://software.intel.com/en-us/license/intel-simplified-software-license. The license for open source distribution has not changed.

Tasks

Intel® TBB now fully supports the this_task_arena::isolate() function. In addition, the this_task_arena::isolate() function and the task_arena::execute() method were extended to pass on the value returned by the executed functor (this feature requires C++11). The task_arena::enqueue() and task_group::run() methods were extended to accept move-only functors.

Flow Graph

A flow graph now spawns all tasks into the same task arena and waiting for graph completion also happens in that arena.

There are some changes affecting backward compatibility:

  • Internal layout changes in some flow graph classes
  • Several undocumented methods are removed from class graph, including set_active() and is_active().
  • Due to incompatible changes, the namespace version is updated for the flow graph; recompilation is recommended for all binaries that use the flow graph classes.

We also introduced a few preview features:

  • opencl_node can be used with any graph object; class opencl_graph is removed.
  • graph::wait_for_all() now automatically waits for all not yet consumed async_msg objects.

Flow Graph Analyzer (FGA) is available as a technology preview in Intel® Parallel Studio XE 2018 and as a feature of Intel® Advisor (https://software.intel.com/en-us/articles/getting-started-with-flow-graph-analyzer). Support for the FGA tool in async_node, opencl_node, and composite_node has been improved.

Introduction of Parallel STL

Parallel STL, an implementation of the C++ standard library algorithms with support for execution policies, has been introduced. Parallel STL relies on Intel® TBB underneath. For more information, see Getting Started with Parallel STL (https://software.intel.com/en-us/get-started-with-pstl).

Additional support for Android*, UWP, macOS
  • Added support for Android* NDK r15, r15b.
  • Added support for Universal Windows Platform.
  • Increased minimally supported version of macOS* (MACOSX_DEPLOYMENT_TARGET) to 10.11.
Bugs fixed
  • Fixed a bug preventing use of streaming_node and opencl_node with Clang; inspired by a contribution from Francisco Facioni.
  • Fixed this_task_arena::isolate() function to work correctly with parallel_invoke and parallel_do algorithms.
  • Fixed a memory leak in composite_node.
  • Fixed an assertion failure in debug tbbmalloc binaries when TBBMALLOC_CLEAN_ALL_BUFFERS is used.
Downloads

You can download the latest Intel® TBB version from http://threadingbuildingblocks.org and https://software.intel.com/en-us/intel-tbb.

In addition, Intel® TBB can be installed using:

Improved insights in Intel® VTune™ Amplifier 2018

Intel® VTune™ Amplifier 2018 (https://software.intel.com/en-us/vtune-amplifier-help) improved insight into parallelism inefficiencies for applications using Intel® Threading Building Blocks (Intel® TBB) with extended classification of high Overhead and Spin time: https://software.intel.com/en-us/articles/overhead-and-spin-time-issue-in-intel-threading-building-blocks-applications-due-to

CMake support

CMake support in Intel® TBB (https://github.com/01org/tbb/tree/tbb_2018/cmake) has been introduced as well.

Samples

All examples for the commercial version of the library were moved online: https://software.intel.com/en-us/product-code-samples. Examples are available as a standalone package or as a part of the Intel® Parallel Studio XE or Intel® System Studio Online Samples packages.

Documentation

The following documentation for Intel® TBB is available:

2017

Update 8

Release Notes

Bugs fixed
  • The assertion failure has been fixed in debug tbbmalloc binaries (commercial and Open Source releases) when TBBMALLOC_CLEAN_ALL_BUFFERS is used.
Additional features
  • We addressed a request from the OpenCL team by adding support for more TBB executors than CPU cores.
Update 7

Release Notes

The updated version contains a new bug fix when compared to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 Update 6 release.

Added functionality:
  • In the huge pages mode, the memory allocator now is also able to use transparent huge pages.
Preview Features:
  • Added support for Intel TBB integration into CMake-aware projects, with valuable guidance and feedback provided by Brad King (Kitware).
Bugs fixed:
  • Fixed scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS, 0) to process memory left after exited threads.

Intel TBB 2017 Update 7 is an open-source-only release. You can download it from https://github.com/01org/tbb/releases.

Update 6

Release Notes

The updated version contains several bug fixes when compared to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 Update 5 release.

Added functionality:
  • Added support for Android* NDK r14.
Preview Features:
  • Added a blocking terminate extension to the task_scheduler_init class that allows an object to wait for termination of worker threads.
Bugs fixed:

Intel TBB is now available to install from YUM and APT repositories.

In addition, you can download the latest Intel TBB open source version from https://github.com/01org/tbb/releases.

Update 5

Release Notes

The updated version contains several bug fixes when compared to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 Update 4 release.

Added functionality:
  • Added support for Microsoft* Visual Studio* 2017.
  • Added graph/matmult example to demonstrate support for compute offload to Intel(R) Graphics Technology in the flow graph API.
  • The "compiler" build option now accepts a full path to the compiler.
Changes affecting backward compatibility:
  • Constructors for many classes, including graph nodes, concurrent containers, thread-local containers, etc., are declared explicit and cannot be used for implicit conversions anymore.
Bugs fixed:
  • Added a workaround for bug 16657 in the GNU C Library (glibc) affecting the debug version of tbb::mutex.
  • Fixed a crash in pool_identify() called for an object allocated in another thread.

Intel TBB 2017 U5 is available as a part of Intel(R) Parallel Studio XE 2018 Beta and is installed with Parallel STL, an implementation of the C++ standard library algorithms with support for execution policies. For more information about Parallel STL, see Getting Started and Release Notes.

In addition, you can download the latest Intel TBB open source version from https://github.com/01org/tbb/releases.

Update 4

Release Notes

The updated version contains several bug fixes when compared to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 Update 3 release.

Added functionality:
  • Added support for C++11 move semantics in parallel_do.
  • Added support for FreeBSD* 11.
Changes affecting backward compatibility:
  • Minimal compiler versions required for support of C++11 move semantics raised to GCC 4.5, VS 2012, and Intel(R) C++ Compiler 14.0.
Bugs fixed:
  • The workaround for crashes in the library compiled with GCC 6 (-flifetime-dse=1) was extended to Windows*.

You can download the latest Intel TBB version from http://threadingbuildingblocks.org and https://software.intel.com/en-us/articles/intel-tbb.

Update 3

Release Notes

Changes since Intel TBB 2017 Update 2:
  • Added support for Android* 7.0 and Android* NDK r13, r13b.
Preview Features:
  • Added template class gfx_factory to the flow graph API. It implements the Factory concept for streaming_node to offload computations to Intel(R) processor graphics.
Bugs fixed:
  • Fixed a possible deadlock caused by missed wakeup signals in task_arena::execute().
Heterogeneous TBB (flow graph promotion):
Update 2

Release Notes

The updated version contains several bug fixes when compared to the previous Intel® Threading Building Blocks (Intel® TBB) 2017 release.

Obsolete
  • Removed the long-outdated support for Xbox* consoles.
Bugs fixed:
  • Fixed the issue with task_arena::execute() not being processed when the calling thread cannot join the arena.
  • Fixed dynamic memory allocation replacement failure on macOS* 10.12.
  • Fixed dynamic memory allocation replacement failures on Windows* 10 Anniversary Update.
  • Fixed emplace() method of concurrent unordered containers to not require a copy constructor.

You can download the latest Intel TBB version from http://threadingbuildingblocks.org and https://software.intel.com/en-us/articles/intel-tbb.

Update 1

Release Notes

Changes since Intel TBB 2017:

Bugs fixed:
  • Fixed dynamic memory allocation replacement failures on Windows* 10 Anniversary Update.
  • Fixed emplace() method of concurrent unordered containers not to require a copy constructor.

Initial Release

Release Notes

One of the best known C++ threading libraries Intel® Threading Building Blocks (Intel® TBB) was recently updated to a new release 2017. The updated version contains several key new features when compared to the previous 4.4 release. Some of them were already released in Intel® TBB 4.4 updates.

Licensing

As with Intel® TBB 2.0, Intel® TBB 2017 both brings technical improvements and becomes more open: it switches to the Apache* 2.0 license, which should enable it to take root in more environments while continuing to simplify effective use of multicore hardware.

Parallel algorithms static_partitioner

Intel® TBB 2017 expands the set of partitioners with tbb::static_partitioner. It can be used in tbb::parallel_for and tbb::parallel_reduce to split the work uniformly among workers. The work is initially split into chunks of approximately equal size. The number of chunks is determined at runtime to minimize the overhead of work splitting while providing enough tasks for available workers. Whether these chunks may be further split is unspecified. This reduces the overhead involved when the work is already well balanced. However, it limits available parallelism and therefore might result in performance loss for unbalanced workloads.

Tasks

Added the tbb::task_arena::max_concurrency() method, which returns the maximal number of threads that can work inside an arena. The amount of concurrency reserved for application threads at tbb::task_arena construction can be set to any value between 0 and the arena concurrency limit.

The tbb::this_task_arena namespace collects information about the arena in which the current task is executing. It has been extended with new functionality:

  • In previous releases, the static method tbb::task_arena::current_thread_index() returned the current thread's slot index in the current arena. That method is now deprecated; use the tbb::this_task_arena::current_thread_index() function instead.
  • Added this_task_arena::max_concurrency(), which returns the maximum number of threads that can work in the current arena.
  • (Preview Feature) Use the tbb::this_task_arena::isolate() function to isolate execution of a group of tasks or an algorithm from other tasks submitted to the scheduler.
Memory Allocation

Improved dynamic memory allocation replacement on Windows* OS to skip DLLs for which replacement cannot be done, instead of aborting.

For 64-bit platforms, quadrupled the worst-case limit on the amount of memory the Intel® TBB allocator can handle.

Intel® TBB no longer performs dynamic replacement of memory allocation functions for Microsoft Visual Studio 2008 and earlier versions.

Flow Graph async_node

tbb::flow::async_node is now a fully supported feature.

It has been re-implemented using the tbb::flow::multifunction_node template, which allows a concurrency level to be specified for the node.

The class template tbb::flow::async_node allows users to coordinate with an activity that is serviced outside of the Intel® TBB thread pool. If your flow graph application needs to communicate with a separate thread, runtime, or device, tbb::flow::async_node might be helpful. It has interfaces to commit results back, maintaining two-way asynchronous communication between an Intel® TBB flow graph and an external computing entity. The tbb::flow::async_node class was a preview feature in Intel® TBB 4.4.

async_msg

Intel TBB 4.4 Update 3 introduced a special tbb::flow::async_msg message type to support communication between the flow graph and external asynchronous activities.

opencl_node

Streaming workloads to external computing devices has been significantly reworked in Intel® TBB 2017 and is introduced as a preview feature. The Intel® TBB flow graph can now be used as a composability layer for heterogeneous computing.

A class template tbb::flow::streaming_node was added to the flow graph API. It allows a flow graph to offload computations to other devices through streaming or offloading APIs. The “streaming” concept uses several abstractions: StreamFactory to produce instances of computational environments, kernel to encapsulate a computing routine, and device_selector to access a particular device.

The following example shows a simple OpenCL* kernel invocation.

File sqr.cl

__kernel
void Sqr( __global float *b2, __global float *b3   )
{
    const int index = get_global_id(0);
    b3[index] = b2[index]*b2[index];
}

File opencl_test.cpp

#define TBB_PREVIEW_FLOW_GRAPH_NODES 1
#define TBB_PREVIEW_FLOW_GRAPH_FEATURES 1

#include <cmath>
#include <iterator>
#include <vector>
#include "tbb/flow_graph_opencl_node.h"
using namespace tbb::flow;

bool opencl_test() {
    opencl_graph g;
    const int N = 1 * 1024 * 1024;
    opencl_buffer<float> b2( g, N ), b3( g, N );
    std::vector<float> v2( N ), v3( N );

    // Fill the input buffer and keep a host-side copy for validation
    auto i2 = b2.access<write_only>();
    for ( int i = 0; i < N; ++i ) {
        i2[i] = v2[i] = float( i );
    }
    // Create an OpenCL program
    // (PathToFile is a helper that is not defined in this listing)
    opencl_program<> p( g, PathToFile("sqr.cl") );
    // Create an OpenCL computation node with kernel "Sqr"
    opencl_node <tuple<opencl_buffer<float>, opencl_buffer<float>>> k2( g, p.get_kernel( "Sqr" ) );
    // Define the iteration range
    k2.set_range( {{ N },{ 16 }} );
    // Pass the input and output buffers to the node
    k2.try_put( std::tie( b2, b3 ) );
    // Run the flow graph computations
    g.wait_for_all();

    // Validation: compare device results with values computed on the host
    auto o3 = b3.access<read_only>();
    bool comp_result = true;
    for ( int i = 0; i < N; ++i ) {
        comp_result = comp_result && std::fabs( o3[i] - v2[i] * v2[i] ) < 0.1e-7;
    }
    return comp_result;
}

Some other improvements in the Intel® TBB flow graph

  • Removed a few cases of excessive user data copying in the flow graph.
  • Reworked tbb::flow::split_node to eliminate unnecessary overheads.

Important note: Internal layout of some flow graph nodes has changed; recompilation is recommended for all binaries that use the flow graph.

Python

An experimental Python module has been added. It unlocks additional performance for multi-threaded Python programs by enabling threading composability between two or more thread-enabled libraries.

Threading composability can accelerate programs by avoiding inefficient thread allocation (called oversubscription), which occurs when there are more software threads than available hardware resources.

The biggest improvement is achieved when a task pool such as ThreadPool from the standard library, or a library such as Dask or Joblib (used in multi-threading mode), executes tasks that call compute-intensive functions of Numpy/Scipy/PyDAAL, which in turn are parallelized using Intel® Math Kernel Library (Intel® MKL) and/or Intel® TBB.

The module implements a Pool class with the standard interface using Intel® TBB, which can be used to replace Python’s ThreadPool. Thanks to the monkey-patching technique implemented in the Monkey class, no source code changes are needed to unlock additional speedups.

For more details see: Unleash parallel performance of python programs

Miscellaneous
  • Added TBB_USE_GLIBCXX_VERSION macro to specify the version of GNU libstdc++ when it cannot be properly recognized, e.g. when used with Clang on Linux* OS.
  • Added support for C++11 move semantics to the argument of tbb::parallel_do_feeder::add() method.
  • Added C++11 move constructor and assignment operator to tbb::combinable class template.
Samples

All examples for the commercial version of the library were moved online: https://software.intel.com/en-us/product-code-samples. Examples are available as a standalone package or as a part of the Intel(R) Parallel Studio XE or Intel(R) System Studio Online Samples packages.

  • Added graph/stereo example to demonstrate tbb::flow::async_msg and tbb::flow::opencl_node.

You can download the latest Intel TBB version from http://threadingbuildingblocks.org and https://software.intel.com/en-us/articles/intel-tbb.

For detailed information about compiler optimizations, see our Optimization Notice.