Intel Parallel Studio

Parallelizing Simple Loops

The simplest form of scalable parallelism is a loop of iterations that can each run simultaneously without interfering with each other. The following sections demonstrate how to parallelize simple loops.


Intel® Threading Building Blocks (Intel® TBB) components are defined in namespace tbb. For brevity’s sake, the namespace is explicit in the first mention of a component, but implicit afterwards.

Reader Writer Mutexes

Mutual exclusion is necessary when at least one thread writes to a shared variable. But it does no harm to permit multiple readers into a protected region. The reader-writer variants of the mutexes, denoted by _rw_ in the class names, enable multiple readers by distinguishing reader locks from writer locks. There can be more than one reader lock on a given mutex.

How Task Scheduling Works

The scheduler evaluates a task graph. The graph is a directed graph where each node is a task. Each task points to its successor, which is another task that is waiting on it to complete, or NULL. Method task::parent() gives you read-only access to the successor pointer. Each task has a refcount that counts the number of tasks that have it as a successor. The graph evolves over time.

Thread Safety

Unless otherwise stated, the thread safety rules for the library are as follows:

  • Two threads can invoke a method or function concurrently on different objects, but not the same object.

  • It is unsafe for two threads to invoke concurrently methods or functions on the same object.

Subscribe to Intel Parallel Studio