Intel® Threading Building Blocks (Intel® TBB) 4.1

Simplify Parallelism with a Scalable Parallel Model

Intel® Threading Building Blocks (Intel® TBB) 4.1 is a widely used, award-winning C++ template library for creating high-performance, scalable parallel applications.

  • Enhance Productivity and Reliability - Rich set of components to efficiently implement higher-level, task-based parallelism
  • Gain Performance Advantage Today and Tomorrow - Future-proof applications to tap multicore and many-core power
  • Fits Within Your Environment - Advanced threading library, compatible with multiple compilers and portable to various operating systems

"Intel® TBB provided us with optimized code that we did not have to develop or maintain for critical system services. I could assign my developers to code what we bring to the software table: crowd simulation software."
Michaël Rouillé, CTO, Golaem


Intel® TBB 4.1 yields linear scaling in these example applications


Also available as open source

Enhance Productivity and Reliability

Intel® TBB 4.1 provides abstractions that make it easier to write scalable and reliable parallel applications with fewer lines of code. Pre-tested algorithms, concurrent containers, synchronization primitives, and a scalable memory allocator simplify parallel application development. Intel® TBB delivers high performing and reliable code with less effort than hand-made threading.

The Intel® TBB 4.1 flow graph and generic parallel algorithms can be customized for a wide variety of problems. The flow graph provides a flexible and convenient API for expressing static and dynamic dependencies between computations. It also extends the applicability of Intel® TBB 4.1 to event-driven and reactive programming models.


Gain Performance Advantage Today and Tomorrow

Intel® TBB 4.1 lets a developer think about parallelism at a higher level and avoid the low-level details of threading. This makes Intel® TBB 4.1 based solutions independent of the number of CPUs and allows performance and scalability to improve as the number of CPUs grows.

With abstract tasks, application performance can improve automatically as processor core count increases. The sophisticated Intel® TBB 4.1 task scheduler dynamically maps tasks to threads to balance the load among available cores, preserve cache locality, and maximize parallel performance. Intel® TBB 4.1 is optimized for multicore architectures and Intel® Many Integrated Core Architecture (Intel® MIC Architecture).


Interoperable

#pragma simd and Intel® TBB can be used together.

Intel® TBB 4.1 is designed to co-exist with other threading packages and technologies (Intel® Cilk™ Plus, Intel® OpenMP, OS threads, etc.). Different components of Intel® TBB 4.1 can be used independently and mixed with other threading technologies. The Intel® TBB 4.1 task scheduler and parallel algorithms support nested and recursive parallelism as well as running parallel constructs side by side. This is useful for introducing parallelism gradually and lets different components of an application implement parallelism independently.


Portability

Organizations can expand their customer base by using a production-ready, open solution for parallelism that is available on a broad range of platforms. Intel® TBB is validated and commercially supported on Windows*, Linux*, and Mac OS* X platforms, using multiple compilers. It is also available on FreeBSD*, IA-based Solaris*, and PowerPC-based systems via the open source community.


Top Community Support

Broad support from an involved community gives developers access to additional platforms and operating systems. Intel® Premier Support services and Intel® Support Forums provide confidential support, technical notes, application notes, and the latest documentation.

A complete documentation package and code samples are readily available both as part of the Intel® TBB 4.1 installation and online at http://threadingbuildingblocks.org. The Getting Started Guide and the Tutorial provide an introduction to Intel® TBB 4.1. The Reference Manual contains formal descriptions of all classes and functions implemented in Intel® TBB 4.1, while the Design Patterns document discusses common parallel programming patterns and how to implement them using Intel® TBB 4.1.

What’s New in Intel® TBB 4.1

Feature and Benefit

Support for Latest Intel Architectures: Intel® Xeon® Processors and Intel® Xeon Phi™ coprocessor

Selecting the best models for your application today will set a path for you to take full advantage of multicore and many-core performance without re-writing your code. Start today by implementing parallelism for today’s architecture and be ready for future architectures.

Improved Flow Graph

Additional exception safety and the ability to iterate over graph nodes are now included in the Flow Graph feature. This improves usability and reliability of the Flow Graph, making it applicable to more use cases.

Get Reproducible Results

Gain the confidence of more reproducible results with TBB’s new template function parallel_deterministic_reduce. Deliver floating-point arithmetic results that are reproducible run after run.
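
The reproducibility concern comes from floating-point addition not being associative: summing the same values in a different grouping can change the rounded result. The sketch below is a minimal illustration in standard C++11. It is not Intel® TBB code and does not claim to show how parallel_deterministic_reduce is implemented; the helper name chunked_sum and the one-thread-per-chunk scheme are assumptions made for the example. It fixes the chunk boundaries and combines the partial sums in index order, so the result does not depend on which thread finishes first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not an Intel TBB function): sums `data` using
// fixed-size chunks. Chunk boundaries depend only on `chunk`, never on
// thread timing, and partial sums are combined in index order.
float chunked_sum(const std::vector<float>& data, std::size_t chunk) {
    assert(chunk > 0);  // precondition: a chunk must hold at least one element
    const std::size_t n_chunks = (data.size() + chunk - 1) / chunk;
    std::vector<float> partial(n_chunks, 0.0f);

    // One thread per chunk keeps the sketch short; a real scheduler would
    // map many chunks onto a small pool of worker threads.
    std::vector<std::thread> workers;
    for (std::size_t c = 0; c < n_chunks; ++c) {
        workers.emplace_back([&data, &partial, chunk, c] {
            const std::size_t begin = c * chunk;
            const std::size_t end = std::min(begin + chunk, data.size());
            float s = 0.0f;
            for (std::size_t i = begin; i < end; ++i) s += data[i];
            partial[c] = s;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) w.join();

    // Combine in a fixed order, independent of which thread finished first.
    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

For a given input and chunk size, the function returns the same value on every run. A different chunk size may legitimately give a different rounded result, because the grouping of the additions changes.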

Additional C++11 Support

Intel is committed to supporting the C++11 standard and we have added more in this release. TBB can be used with C++11 compilers and supports lambda expressions.

New Examples and Documentation

The new HTML and CHM TBB Reference Manual makes it easier to find the answers you need.

New examples demonstrate usage of major new features including logic_sim for the flow graph. Please visit http://threadingbuildingblocks.org to view and learn from the new examples.

Rich set of components for Performance and Productivity
Intel® TBB 4.1 Pre-Tested Capabilities

Parallel Algorithms
Generic implementation of common parallel performance patterns

Generic implementations of parallel patterns such as parallel loops, flow graphs, and pipelines can be an easy way to achieve a scalable parallel implementation without developing a custom solution from scratch.

Dynamic Task Scheduler
Engine that manages parallel tasks and task groups

The Intel® TBB 4.1 task scheduler enables task-based programming and uses work stealing for dynamic workload balancing, a scalable and higher-level alternative to managing OS threads manually. The implementation supports C++ exceptions, task and task group priorities, and cancellation, which are essential for large and interactive parallel C++ applications.

Concurrent Containers
Generic implementation of common idioms for concurrent access

Intel® TBB 4.1 concurrent containers are a concurrency-friendly alternative to serial data containers. Serial data structures (such as C++ STL containers) often require a global lock to protect them from concurrent access and modification. Intel® TBB concurrent containers allow multiple threads to access and update items in the container concurrently, which increases the available concurrency and improves an application's scalability.

Synchronization Primitives
Exception-safe locks, condition variables, and atomic operations

Intel® TBB 4.1 provides a comprehensive set of synchronization primitives with different qualities that are applicable to common synchronization strategies. The exception-safe implementation of locks helps to avoid deadlock in programs that use C++ exceptions. Using Intel® TBB atomic variables instead of the C-style atomic API minimizes potential data races.
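
The exception-safety point can be illustrated with a scoped lock: the lock is acquired in a constructor and released in a destructor, so it is released even when the protected code throws. The following is a minimal sketch in standard C++11 that uses std::mutex and std::lock_guard rather than the Intel® TBB mutex classes; the function names and the negative-value error condition are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex g_mutex;  // protects the shared value in this sketch

// Hypothetical update routine: holds the lock only for the scope of the call.
// If it throws, the lock_guard destructor still releases the mutex.
void update_or_throw(int& shared, int value) {
    std::lock_guard<std::mutex> guard(g_mutex);
    if (value < 0) throw std::invalid_argument("negative value");
    shared = value;
}

// Returns true if the mutex is free again after update_or_throw has thrown.
bool lock_released_after_throw() {
    int shared = 0;
    try {
        update_or_throw(shared, -1);
    } catch (const std::invalid_argument&) {
        // The exception unwound through the guard, which unlocked the mutex.
    }
    assert(shared == 0);  // the failed update must not have changed the value
    if (!g_mutex.try_lock()) return false;  // would fail if the lock had leaked
    g_mutex.unlock();
    return true;
}
```

With a manual lock()/unlock() pair, the throw would skip the unlock and the next thread to request the mutex would block forever.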

Scalable Memory Allocators
Scalable memory manager and false-sharing free memory allocator

The scalable memory allocator avoids scalability bottlenecks by minimizing access to a shared memory heap via per-thread memory pool management. Special management of large (≥8KB) blocks allows more efficient resource usage, while still offering scalability and competitive performance. The cache-aligned memory allocator avoids false sharing by not allowing allocated memory blocks to split a cache line.
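
False sharing occurs when data written by different threads lands on the same cache line, so the cores keep invalidating each other's copy of that line. The sketch below shows the underlying idea in standard C++11 with alignas; it does not use the Intel® TBB allocators. The 64-byte line size, the PaddedCounter type, and the helper names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical per-thread counter, aligned and padded to a full cache line
// so that two adjacent counters never occupy the same line.
struct alignas(kCacheLine) PaddedCounter {
    long value;
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each counter should occupy exactly one cache line");

// Returns true if the two addresses fall within the same cache line.
bool share_cache_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}

// Checks that every counter in the array starts on a cache-line boundary.
void check_alignment(const PaddedCounter* counters, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        assert(reinterpret_cast<std::uintptr_t>(&counters[i]) % kCacheLine == 0);
    }
}
```

Adjacent elements of a plain long array usually share a line, whereas adjacent PaddedCounter elements do not, at the cost of extra memory per element.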

Create arbitrary task trees

When an algorithm cannot be expressed with high-level Intel® TBB 4.1 constructs, the user can choose to create arbitrary task trees. Tasks can be spawned for better locality and performance, or enqueued to maintain FIFO-like order and ensure starvation-resistant execution.

Scalability with Future-proofing

  • With abstract tasks, application performance automatically improves as processor core count increases. The sophisticated task scheduler dynamically maps tasks to threads to balance the load among available cores, preserve cache locality, and maximize parallel performance.
  • Intel® Threading Building Blocks 4.1 yields linear scaling in these example applications

Select the right Intel® TBB 4.1 license

  • Commercial Binary Distribution for customers who may require commercial support services. Attractive pricing available for academic, student and classroom usage.
  • Open Source Distribution can be used under GPLv2 with the runtime exception allowing usage in proprietary applications. Allows support for additional OSs and hardware platforms. Both source and binary forms are available for download from http://threadingbuildingblocks.org.
  • Custom license available if you require the ability to modify or distribute the commercial source code of Intel® TBB. Contact your Intel representative for more information.

  • What is parallelism?
  • By way of analogy, if you’re trying to cook a multi-course meal and all you have is a single-burner stove, you can only cook one part of the meal at a time. If you have a stove with four burners, you can cook four things at once, bringing them to the table all at the same time. Software parallelism is similar. There are different techniques used to achieve parallelism—threading is one of them. The idea is to take an application and, where it is possible, split up the program so different parts can run simultaneously on different processors in a multicore configuration. Then, part of the application brings all the parts together to present the application results.

    Task-based parallelism is a mechanism to execute several work items (tasks) in parallel.

  • What is threading?
  • Threading is a kind of parallelism. It’s a technique used by software developers to decompose applications into parts that can be run simultaneously on a computer with multiple processors or multiple cores. Threaded applications run on a single computer, again with multiple cores, under the management of a single operating system.

  • Why does C++ need TBB?
  • C++, like other widely used languages, was not designed to express parallelism. Fortunately, C++ is extensible through templates. Developers liked the OpenMP concept, whereby they could get scalable performance without adding much new code, yet needed something more conducive to the object-oriented, template-based programming style of C++. Developers wanted us to do something about parallel containers and algorithms, so templates were a perfect fit. The ‘generic programming’ style that STL uses, which allows components to be easily composed without giving up performance, appealed a great deal to us. We settled on extending C++ in a fashion similar to how STL extended C++.

    Abstraction is important to developers. Using native threads and doing your own explicit thread management is like assembly language for parallelism. TBB is the abstraction we need for many reasons. Programming for parallelism using native threads is tedious, error-prone, and not portable. It is also seldom as scalable as it could be, because high levels of scalability are more difficult to program.

  • Does Intel charge run-time fees for its libraries?
  • No.

  • Are there analysis tools that understand the semantics of Intel® TBB 4.1?
  • Yes. Applications threaded with Threading Building Blocks 4.1 can be analyzed with Intel® VTune™ Amplifier XE 2013 and Intel® Inspector XE 2013. Intel® Advisor XE 2013, available in any Intel® Studio XE product, can help find regions with the greatest performance potential from parallelism.

  • Where can I get an evaluation copy of Intel® TBB 4.1?
  • 30-day evaluation versions of Intel® Software Development Products are available for free download for Windows, Linux, or Mac OS X. You can get free support during the evaluation period by creating an Intel® Premier Support account after requesting the evaluation license.

  • Are there any books to help developers better understand how to use TBB?
  • A book on Intel Threading Building Blocks, by James Reinders, has been published by O’Reilly Media. The tutorial, examples, and other documentation that come with the download are also excellent resources.

  • How do the commercial and open source versions differ?
  • Currently there are no product differences between the commercial and open source versions. We maintain one source base and build both versions from it. The versions differ in the level of support.

    Intel® TBB 4.1 is offered commercially for Windows, Linux, or Mac OS X customers who want additional support or cannot accept the open source license (GPLv2 with the runtime exception). With a purchase, customers receive one year of product updates available at the Intel® Registration Center and one year of technical support from Intel® Premier Support, our interactive issue management and communication website. This premier support service allows you to submit questions to Intel engineers. In addition to Intel Premier Support, we have user forums, technical notes, application notes, and documentation available on our website.

    The open source support includes user forums, documentation and additional resources available on our website, threadingbuildingblocks.org.

  • Where can I get more information on the TBB open source offering?
  • For more information, visit threadingbuildingblocks.org. The site includes TBB source code, documentation, user forums, blogs, podcasts, articles, white papers and support areas.
