I would like to introduce a contribution to TBB, available via the TBB Community Project as TBB Contributed Code. The code may be downloaded via svn checkout; instructions are here: http://code.google.com/p/tbbcommunity/source/checkout
Most of my postings have focused on components relating to the simulation project YetiSim, and I have also asked many other questions on this forum. As you may or may not know, I have been abstracting the core of YetiSim into TBB components so that YetiSim will become a simple application of new components, and so that the TBB community may benefit from an abstraction which is useful (hopefully) for more than just simulation.
Today I would like to introduce you to tcc::parallel_for_group (tcc means TBB Community Code). The concept behind parallel_for_group is that you have a composite type. This might be a collection of items, a robot, a machine, or a game board for Conway's Game of Life (as in my simple example in the source tree). You want to separate the composite into multiple grouped parts and an ungrouped part. The grouped parts and the ungrouped part might be of different types.
Processing may proceed in parallel on the grouped parts and on the ungrouped part. By requirement, each group can be processed independently of the others, and so can the ungrouped part. The algorithm template parallel_for_group executes a function object, with appropriate operator() overloads, on the ungrouped and grouped items at the same time.
There are many reasons to group items. Perhaps there are dependencies which must be considered, so some initial work is done up front to find them, and the cost of that work is made up by the time saved in parallel processing. Perhaps we wish to make smaller copies of data which have just the information required for processing. For example, perhaps we are performing matrix operations, and we divide the matrix into blocks, with each block requiring a local copy of the information passed to it.
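To make the "smaller local copies" idea a bit more concrete, here is a minimal sketch in plain C++ (no TBB) that splits a row-major matrix into row blocks, each holding its own copy of the rows it needs. The names RowBlock and split_into_row_blocks are made up for this illustration and are not part of tcc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type: one group is a contiguous run of rows copied out of the matrix.
struct RowBlock {
    std::size_t first_row;     // index of the block's first row in the source matrix
    std::size_t rows;          // number of rows held by this block
    std::vector<double> data;  // local row-major copy, rows * n_cols values
};

// Split a row-major matrix (n_rows x n_cols) into blocks of at most
// rows_per_block rows. Each block owns its data, so the blocks can be
// processed independently of one another.
inline std::vector<RowBlock> split_into_row_blocks(const std::vector<double>& m,
                                                   std::size_t n_rows,
                                                   std::size_t n_cols,
                                                   std::size_t rows_per_block) {
    assert(m.size() == n_rows * n_cols);  // the matrix must be fully populated
    std::vector<RowBlock> blocks;
    if (rows_per_block == 0) return blocks;  // no sensible split exists
    for (std::size_t r = 0; r < n_rows; r += rows_per_block) {
        RowBlock b;
        b.first_row = r;
        b.rows = std::min(rows_per_block, n_rows - r);  // last block may be short
        b.data.assign(m.begin() + r * n_cols,
                      m.begin() + (r + b.rows) * n_cols);
        blocks.push_back(b);
    }
    return blocks;
}
```

The copy costs time and memory up front; whether that pays off depends on how much work each block then does on its own data.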
The basic algorithm template is:
typename SeparatorType::separation_type parallel_for_group(const CompositeType& composite, const SeparatorType& separator, const Body& body, const Partitioner& partitioner)
The separator type passed must support the following operation:
separation separate(const CompositeType&)
This function takes a CompositeType, separates it into un/grouped items, and returns a struct which contains that information. The separator must publicly inherit from a provided separator class. This class has no runtime cost, because it only holds typedefs for later use by the algorithm templates (I'm not a template guru, so maybe there is a cleaner way).
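As a rough sketch of what such a separator could look like, the following uses a typedef-only base class and a toy separator over a vector of ints. Only the typedef names separation_type, grouped_range_type and ungrouped_range_type and the separate() operation come from the description above; separator_base, its template parameters, IntSeparation and Mod3Separator are assumptions made for the example, and the real base class in the tcc tree may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical base class: it carries typedefs only, so it is an empty class.
template <typename Composite, typename Separation,
          typename GroupedRange, typename UngroupedRange>
struct separator_base {
    typedef Composite      composite_type;
    typedef Separation     separation_type;
    typedef GroupedRange   grouped_range_type;
    typedef UngroupedRange ungrouped_range_type;
};

// Hypothetical result struct returned by separate().
struct IntSeparation {
    std::vector<std::vector<int> > grouped;  // grouped[k]: values v >= 0 with v % 3 == k
    std::vector<int> ungrouped;              // negative values
};

// Toy separator: non-negative values are grouped by remainder mod 3,
// negative values are left ungrouped.
struct Mod3Separator
    : public separator_base<std::vector<int>, IntSeparation,
                            std::vector<int>, std::vector<int> > {
    separation_type separate(const composite_type& composite) const {
        separation_type s;
        s.grouped.resize(3);
        for (std::size_t i = 0; i < composite.size(); ++i) {
            const int v = composite[i];
            if (v < 0) s.ungrouped.push_back(v);
            else       s.grouped[static_cast<std::size_t>(v % 3)].push_back(v);
        }
        return s;
    }
};

// The typedef-only base contributes no data members to the separator.
static_assert(std::is_empty<Mod3Separator>::value,
              "typedef-only base class adds no per-object state");
```

The static_assert is one way to check the "no runtime cost" claim for a particular separator: the class stays empty as long as the separator itself adds no data members.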
The result of the separation is fed into a body, which must support the following:
void operator()(SeparatorType::grouped_range_type& x)
void operator()(SeparatorType::ungrouped_range_type& x)
Notice that the SeparatorType is currently passed to parallel_for_group. This is temporary, because I would prefer to return a reference to the separation rather than a copy.
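To show how the pieces fit together, here is a small sketch of a body with the two overloads, plus a stand-in function that mirrors the shape of parallel_for_group. Please read it with some care: the stand-in runs sequentially, omits the partitioner, and does not use TBB at all, so it only illustrates the overload dispatch and not the parallel execution. PairSeparator, Separation, SumThenNegate, for_group_sketch and the member names groups and rest are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical separation: a list of independent groups plus one ungrouped part.
struct Grouped   { std::vector<int> items; };
struct Ungrouped { std::vector<int> items; };
struct Separation {
    std::vector<Grouped> groups;
    Ungrouped rest;
};

// Hypothetical separator exposing the typedef names used in the post.
struct PairSeparator {
    typedef Separation separation_type;
    typedef Grouped    grouped_range_type;
    typedef Ungrouped  ungrouped_range_type;

    // Pairs up neighbouring items; a trailing odd item stays ungrouped.
    separation_type separate(const std::vector<int>& composite) const {
        separation_type s;
        std::size_t i = 0;
        for (; i + 1 < composite.size(); i += 2) {
            Grouped g;
            g.items.push_back(composite[i]);
            g.items.push_back(composite[i + 1]);
            s.groups.push_back(g);
        }
        if (i < composite.size()) s.rest.items.push_back(composite[i]);
        return s;
    }
};

// Body with one operator() overload per kind of part.
struct SumThenNegate {
    void operator()(PairSeparator::grouped_range_type& x) const {
        assert(x.items.size() == 2);           // PairSeparator always builds pairs
        const int sum = x.items[0] + x.items[1];
        x.items.assign(1, sum);                // collapse the pair into its sum
    }
    void operator()(PairSeparator::ungrouped_range_type& x) const {
        for (std::size_t i = 0; i < x.items.size(); ++i) x.items[i] = -x.items[i];
    }
};

// Sequential stand-in for parallel_for_group (assumption: the separation
// exposes members named groups and rest; the real code may differ).
template <typename CompositeType, typename SeparatorType, typename Body>
typename SeparatorType::separation_type
for_group_sketch(const CompositeType& composite,
                 const SeparatorType& separator,
                 const Body& body) {
    typename SeparatorType::separation_type s = separator.separate(composite);
    for (std::size_t i = 0; i < s.groups.size(); ++i) body(s.groups[i]);  // grouped overload
    body(s.rest);                                                        // ungrouped overload
    return s;
}
```

Calling for_group_sketch(values, PairSeparator(), SumThenNegate()) on a vector holding 1, 2, 3, 4, 5 returns a separation whose two groups hold 3 and 7 and whose ungrouped part holds -5.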
Also, please note that I have adjusted tbb::blocked_range and tbb::parallel_for to support non-const ranges. The reason for this is so that users can pass un/grouped items which may be directly manipulated by the body. The user can enforce const-ness in the range by specifying that the un/grouped items are pointers to const.
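The pointer-to-const point can be illustrated without TBB. In the sketch below, both ranges are passed to the body as non-const references, but only the range of pointers to non-const lets the body change the items. Cell, mutable_cell_range, readonly_cell_range and IncrementOrSum are hypothetical names, and std::vector stands in for whatever range type the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Cell { int value; };

// Hypothetical range types: the range itself is non-const in both cases,
// but only the first lets the body modify the cells it points to.
typedef std::vector<Cell*>       mutable_cell_range;
typedef std::vector<const Cell*> readonly_cell_range;

struct IncrementOrSum {
    int* total;  // where the read-only overload accumulates its result

    void operator()(mutable_cell_range& r) const {
        for (std::size_t i = 0; i < r.size(); ++i) r[i]->value += 1;  // allowed
    }
    void operator()(readonly_cell_range& r) const {
        assert(total != 0);
        for (std::size_t i = 0; i < r.size(); ++i) *total += r[i]->value;
        // r[i]->value += 1;  // would not compile: the pointee is const
    }
};

// The const-ness lives in the pointee type, not in the range.
static_assert(
    std::is_const<std::remove_pointer<readonly_cell_range::value_type>::type>::value,
    "items of the read-only range point to const cells");
```

So the choice between the two range types is where the user decides whether the body may modify the items.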
Anyways, this is a basic introduction. Please have a look at the code, and share your thoughts. I have only provided an example here, and wanted to share the code for further input. There is more work to be done on the construct, and I will be implementing parallel_reduce_group in a similar manner.
I'm not the greatest person at communicating technically, so please ask questions and let me answer them. Check out the code, and enjoy!