Intel® Threading Building Blocks (Intel® TBB)

Get advanced threading for fast, scalable parallel applications

  • Parallelize computationally intensive work, delivering higher-level and simpler solutions using standard C++.
  • Most feature-rich and comprehensive solution for parallel application development.
  • Highly portable, composable, affordable, and approachable, with future-proof scalability.
  • Take advantage of Priority Support: connect privately with Intel engineers for technical questions.

Customer Testimonials

“DreamWorks* Fur Shader used Intel® TBB, which produced an average of five times the speedup on a fur generation loop.”

“Intel TBB was an invaluable help in multithreading our in-house renderer CGI Studio* and is now also used in animation and simulation software. Beside the ease of use, it takes care of the two most important aspects of running an application on multiple cores: load balancing and scalability.”

Maurice van Swaaij
Blue Sky Studios*

“Using Intel TBB’s new flow graph feature, we accomplished what was previously not possible, parallelizing a very sizable task graph with thousands of interrelationships all in about a week.”

Robert Link
GCAM Project Scientist
Pacific Northwest National Laboratory

“The Intel Threading Building Blocks flow graph interface lets us quickly add parallelism to GE healthcare ultrasound products and get great performance. The features and flexibility of the flow graph interface let us express the dependencies between our image calculations, exposing the parallelism in the computation, without the need to hand-roll a complex threading layer.”

Paul O’Dea
Software Architect
GE Healthcare

“Intel TBB provided us with optimized code that we did not have to develop or maintain for critical system services. I could assign my developers to code what we bring to the software table.”

Michaël Rouillé

Case Studies

University of Bristol Accelerates Rational Drug Design

Using Intel TBB, the University of Bristol slashed calculation time for drug development: a calculation that once took 25 days now runs in just one day.

Intel TBB: The Backbone of CAD Exchanger Parallelism

Parallelism brings CAD Exchanger software dramatic gains in performance and user satisfaction, plus a competitive advantage. “CAD Exchanger is broadly using multithreaded algorithms to increase performance on multicore systems,” said Roman Lygin of CADEX, Ltd. “This is the key advantage over our competitors.”

Benchmarks show how CAD Exchanger outperforms its earlier editions in significant ways:

  • Some heavyweight computational algorithms, such as blended surface approximation, ran fifteen times faster than in single-threaded mode.
  • Multithreaded visualization significantly increased the responsiveness of the GUI application, which in turn improved the user experience. Less time spent waiting means more time to interact and innovate.
  • Parallel file I/O is two and a half times faster, and visualization time is reduced by up to four times.

Intel TBB Helps Johns Hopkins University Prepare for a Manycore Future

Modern DNA sequencing provides an inexpensive and high-resolution window into diverse aspects of biology, genetics, and disease. Like a microscope, a sequencer produces a snapshot of a collection of cells. Unlike a microscope, a sequencer does not provide a finished, ready-to-interpret image. Rather, it produces billions of tiny snippets (reads) of DNA that must first be composed into longer, interpretable units such as genes or chromosomes. Bowtie and Bowtie 2 are widely used software tools produced in the university’s Langmead Lab that allow biologists to piece together the fragmentary evidence generated by DNA sequencers.

Johns Hopkins and Intel have been collaborating on the Bowtie 2 application. Adding parallelism via Intel TBB resulted in a substantial speedup of the application. By splitting reads from parsing in a critical section, the team saw essentially ideal scaling up to 120 threads.

The team was able to effectively prepare these core genomics software tools for the manycore future around the corner.
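One way to read the Bowtie 2 change described above is that only the cheap step of fetching a raw read stays inside the critical section, while the costlier parsing runs outside the lock so threads can do it concurrently. The sketch below illustrates that general pattern only; it is not the Bowtie 2 code. It uses standard C++ threads and a mutex rather than Intel TBB so that it compiles without the library, and the names `RawReadSource`, `ParsedRead`, `parse_read`, and `parse_all`, along with the "name sequence" input format, are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical record type: one parsed read (name plus sequence).
struct ParsedRead {
    std::string name;
    std::string sequence;
};

// Shared input source. Only the cheap step (copying out the next raw line)
// happens while the mutex is held.
class RawReadSource {
public:
    explicit RawReadSource(std::vector<std::string> lines)
        : lines_(std::move(lines)) {}

    // Fetches the next raw line and its position; returns false when exhausted.
    bool next(std::string& raw, std::size_t& index) {
        std::lock_guard<std::mutex> lock(mutex_);  // critical section begins
        if (cursor_ >= lines_.size()) return false;
        index = cursor_;
        raw = lines_[cursor_++];
        return true;                               // lock released on return
    }

private:
    std::mutex mutex_;
    std::vector<std::string> lines_;
    std::size_t cursor_ = 0;
};

// The expensive step, done outside the lock.
// Assumed input format (illustrative only): "<name> <sequence>".
ParsedRead parse_read(const std::string& raw) {
    std::istringstream in(raw);
    ParsedRead read;
    in >> read.name >> read.sequence;
    return read;
}

// Parses every line using num_threads workers; results keep input order.
std::vector<ParsedRead> parse_all(std::vector<std::string> lines,
                                  unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    const std::size_t count = lines.size();
    RawReadSource source(std::move(lines));
    std::vector<ParsedRead> parsed(count);  // each slot written by one thread only

    auto worker = [&source, &parsed]() {
        std::string raw;
        std::size_t index = 0;
        while (source.next(raw, index)) {   // short critical section
            parsed[index] = parse_read(raw);  // parsing happens unlocked
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& thread : threads) thread.join();
    return parsed;
}
```

Calling `parse_all(lines, 4)` on a vector of raw lines returns the parsed reads in input order. Keeping the locked region down to a single string copy is what lets additional threads spend their time parsing instead of waiting; how far that scales on real data depends on how expensive the input step itself is.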

Virtual Population Growth: Intel TBB Drives Innovation in Crowd Simulation

“Some years ago, I had to write a library similar to Intel TBB for a cross-platform distributed 3D engine. It took me three months to code and debug the whole thing,” Rouillé said. “With Intel TBB, the guys writing our CPU code provided me with optimized code that I did not have to develop or maintain for critical system services, so I could focus my developers on coding innovations in our key technology.”

Mentor Graphics* Speeds Design Cycles with Intel® Software Tools

Thermal simulations get a performance boost for faster time to market. Mentor Graphics* achieved a nearly twofold improvement, even on one core, through code optimization based on the insight provided by Intel® VTune™ Amplifier XE. Good scalability resulted from a combination of Intel TBB and OpenMP* parallelization techniques: more than eight times the performance of the previous version on eight cores, and up to eleven times on 16 cores. Using the Intel TBB library overcame bottlenecks in memory allocation, and the tbb::task concept allowed Mentor Graphics to parallelize complex algorithms in a way that had not been possible with the OpenMP paradigm.

University of Southern California Students Use Intel® Game Development Tools to Increase Game Performance

The University of Southern California (USC) GamePipe Laboratory is part of the USC Viterbi School of Engineering. It is one of the leading game development degree programs in the United States. Midway through the project, the team began to notice performance degradation as more assets were added to the game. Through the use of Intel TBB, an average performance increase of 15 to 20 percent was realized across most test systems, with as much as a 100 percent increase on systems that were particularly CPU-bottlenecked. The result was a smoother, more responsive experience: input handling was now processed synchronously with rendering, which, combined with higher frame rates, reduced input lag.
