Intel TBB


A loop can do a reduction, as in this summation:

float SerialSumFoo( float a[], size_t n ) {
    // Serial reduction: accumulate Foo(a[i]) over all n elements.
    float sum = 0;
    for( size_t i=0; i!=n; ++i )
        sum += Foo(a[i]);
    return sum;
}

If the iterations are independent, you can parallelize this loop using the template function parallel_reduce.

Task Scheduler Summary

The task scheduler works most efficiently for fork-join parallelism with many forks. Task stealing then produces enough breadth-first behavior to keep all threads occupied, and each thread proceeds depth-first until it needs to steal more work.

Appendix A Costs of Time Slicing

Time slicing allows a system to run more logical threads than physical threads. Each logical thread is serviced for a time slice by a physical thread. If a thread runs longer than a time slice, as most do, it relinquishes the physical thread until it gets another turn. This appendix details the costs incurred by time slicing.


The following table provides additional information on the members of the concurrent_unordered_map and concurrent_unordered_multimap template classes.