need something like a sorted parallel_do

I need something like a sorted parallel_do. It looks like concurrent_priority_queue fits, but I would have to deal with thread pools myself, wouldn't I?


The example uses task_group, but to use it I would have to count the running tasks myself and would have to know the number of CPUs.
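Roughly, what I would have to write looks like this (only a sketch, assuming int items and a placeholder process() function; names are made up):

#include "tbb/task_group.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/concurrent_priority_queue.h"

void process( int /*item*/ ) {}   // placeholder for the real per-item work

void drain_with_task_group( tbb::concurrent_priority_queue<int>& pq ) {
    // The CPU count I would have to look up myself:
    int nworkers = tbb::task_scheduler_init::default_num_threads();
    tbb::task_group tg;
    for( int i = 0; i < nworkers; ++i )       // one task per hardware thread
        tg.run( [&pq] {
            int item;
            while( pq.try_pop(item) )         // each worker drains the queue
                process( item );
        } );
    tg.wait();
}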

parallel_do() is an algorithm, and concurrent_priority_queue is a data structure.

You can run parallel_do() from a concurrent_priority_queue.

I did not see any parallel_do interface where I can pass the container that the parallel_do_feeder<item_type>::add implementation should use.

OK, I think I found a solution:

I pass a dummy container to parallel_do; its contents don't matter, so one could use boost::integer_range. The range must have the same size as the number of elements contained in the concurrent_priority_queue before parallel_do() is invoked. At the start of the body function I pop one argument from the concurrent_priority_queue (try_pop() must succeed); the argument actually passed to the body is ignored. Whenever the body adds a real argument to the concurrent_priority_queue, it also calls tbb::parallel_do_feeder::add(dummy_argument).
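In code, the trick looks roughly like this (only a sketch: it assumes int items, uses a std::vector<char> instead of boost::integer_range as the dummy range, and process() is a placeholder):

#include "tbb/parallel_do.h"
#include "tbb/concurrent_priority_queue.h"
#include <vector>

void process( int /*item*/ ) {}   // placeholder for the real work

void run_with_dummy( tbb::concurrent_priority_queue<int>& pq ) {
    // Dummy range: one placeholder element per item already in the queue.
    std::vector<char> dummy( pq.size() );
    tbb::parallel_do( dummy.begin(), dummy.end(),
        [&pq]( char /*ignored*/, tbb::parallel_do_feeder<char>& feeder ) {
            int item;
            if( !pq.try_pop(item) )   // must succeed if the counts stay in sync
                return;
            process( item );
            // When new real work is generated:
            //   pq.push( new_item );
            //   feeder.add( 0 );     // keep the dummy count in sync
        } );
}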


Can anybody come up with a solution that does not require the unused container hidden inside parallel_do?

The easiest solution is with a small adapter to the unfortunately "deprecated" parallel_while(), translating pop_if_present() or somesuch to the container's try_pop().
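Something along these lines (a sketch only; PriorityQueueStream and ProcessItem are invented names, and int items are assumed):

#include "tbb/parallel_while.h"
#include "tbb/concurrent_priority_queue.h"

// Adapter exposing the pop_if_present() interface that parallel_while expects,
// forwarding to concurrent_priority_queue::try_pop().
template<typename Queue, typename T>
class PriorityQueueStream {
    Queue& pq;
public:
    PriorityQueueStream( Queue& q ) : pq(q) {}
    bool pop_if_present( T& item ) { return pq.try_pop(item); }
};

// Body for parallel_while; argument_type is a required typedef.
struct ProcessItem {
    typedef int argument_type;
    void operator()( int item ) const { /* process(item); */ }
};

// Usage sketch:
//   tbb::concurrent_priority_queue<int> pq;   // filled beforehand
//   PriorityQueueStream< tbb::concurrent_priority_queue<int>, int > stream(pq);
//   tbb::parallel_while<ProcessItem> w;
//   w.run( stream, ProcessItem() );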

I would not recommend relying on any reported current length of the container: unless I am mistaken, there is no assurance that this length does not overestimate the actual number of elements present, so try_pop() might conceivably fail. It could be that I just didn't get the point, but there should also be no need to resort to the feeder.

I recommend the easier solution for now. You can always come up with something better if and when parallel_while() is removed, and by that time the development team may have provided, or "somebody" may have contributed, an easier solution.


Raf Schietekat wrote:

The easiest solution is with a small adapter to the unfortunately "deprecated" parallel_while(), translating pop_if_present() or somesuch to the container's try_pop().

Raf, I'm not sure why you think that deprecating parallel_while is unfortunate. Its functionality is covered by either parallel_do or a simple two-stage parallel_pipeline, unless someone wants both an input stream without iterators and the possibility to add work on the fly. And even that can possibly be worked around if on-the-fly work could be added to the stream directly, or e.g. to a concurrent_queue shared between the input filter and the processing filter.

My recommendation for "something like a sorted parallel_do" is a combination of parallel_pipeline with concurrent_priority_queue. The queue defines the order; the first pipeline stage pops items out of the queue in that order, and the second pipeline stage processes the items. And in the case when some work needs to be added on the fly, it might be added to the initial queue, with some extra care necessary as the input filter might already take everything out of the queue and signal the pipeline to stop; e.g. a loop around the pipeline that checks if the queue is empty, and re-runs the pipeline if it's not, would be sufficient.
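A rough sketch of that arrangement (assuming int items, a placeholder process(), the lambda form of parallel_pipeline, and an arbitrary token count):

#include "tbb/pipeline.h"
#include "tbb/concurrent_priority_queue.h"

void process( int /*item*/ ) {}   // placeholder for the real per-item work

void run_sorted( tbb::concurrent_priority_queue<int>& pq ) {
    const size_t ntokens = 16;    // arbitrary; tune as needed
    do {
        tbb::parallel_pipeline( ntokens,
            // Serial input stage: pops items in priority order.
            tbb::make_filter<void,int>( tbb::filter::serial_in_order,
                [&pq]( tbb::flow_control& fc ) -> int {
                    int item;
                    if( !pq.try_pop(item) ) { fc.stop(); return 0; }
                    return item;
                } )
            &
            // Parallel stage: processes the items.
            tbb::make_filter<int,void>( tbb::filter::parallel,
                []( int item ) { process(item); } )
        );
    } while( !pq.empty() );   // re-run if work arrived after the input stage stopped
}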

Another alternative is a combination of parallel_for with concurrent_priority_queue if the latter is not filled concurrently with execution. In this case, you know the number of items in the container and can run parallel_for(0, pq.size(), body);
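For example (again a sketch with int items and a placeholder process(); note that each iteration simply pops whatever item is available next rather than the item at its index):

#include "tbb/parallel_for.h"
#include "tbb/concurrent_priority_queue.h"

void process( int /*item*/ ) {}   // placeholder for the real per-item work

// Assumes the queue is not filled concurrently with execution.
void run_fixed( tbb::concurrent_priority_queue<int>& pq ) {
    tbb::parallel_for( size_t(0), size_t(pq.size()), [&pq]( size_t ) {
        int item;
        if( pq.try_pop(item) )
            process( item );
    } );
}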

I think that it is easier to work with parallel_while() than with parallel_do(). For some, inventing an ad-hoc iterator is trivial, but it makes me at least scratch my head, and for others it will be a big or insurmountable hurdle. This is a prime example: the simplest glue is a fairly trivial class sitting between concurrent_priority_queue and parallel_while().

A pipeline would be enough here (no feeder needed), but not in the general case. Also, in the general case, feeding back into a concurrent_queue would be bad for performance because new items should probably be processed while they are still hot in cache (curiously, there's no concurrent_stack).

Maybe you could ship an input_iterator with concurrent_priority_queue, but that still leaves other uses for parallel_do() unaddressed. There should at least be example code in the Reference Manual to construct an ad-hoc input iterator (copy&paste plus modify), but it won't be minimal. Maybe it's possible to provide an input iterator that is a class template that takes a user-provided function that would call try_pop() or whatever has to be done?
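Such an iterator template might look something like this (purely a sketch; pop_iterator, Popper and Body are invented names):

#include <cstddef>
#include <iterator>

// Input iterator over a "try_pop"-style functor.  Dereferencing yields the
// last popped value; once a pop fails, the iterator compares equal to the
// default-constructed end iterator.
template<typename T, typename TryPop>
class pop_iterator {
    TryPop* try_pop;   // null marks the end iterator
    T value;
    void fetch() { if( try_pop && !(*try_pop)(value) ) try_pop = 0; }
public:
    typedef std::input_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const T* pointer;
    typedef const T& reference;

    pop_iterator() : try_pop(0) {}                              // end
    explicit pop_iterator( TryPop& f ) : try_pop(&f) { fetch(); }
    const T& operator*() const { return value; }
    pop_iterator& operator++() { fetch(); return *this; }
    bool operator==( const pop_iterator& o ) const { return try_pop == o.try_pop; }
    bool operator!=( const pop_iterator& o ) const { return try_pop != o.try_pop; }
};

// Usage sketch with concurrent_priority_queue:
//   struct Popper {
//       tbb::concurrent_priority_queue<int>& q;
//       bool operator()( int& v ) { return q.try_pop(v); }
//   } popper = { pq };
//   tbb::parallel_do( pop_iterator<int,Popper>(popper),
//                     pop_iterator<int,Popper>(), Body() );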
