Making parallel_while Better?

I may be speaking out of turn, as I haven't looked at the class template parallel_while itself, only at what appears in the tutorial. However, what I've seen suggests that the usability of parallel_while is not ideal. I'm going to suggest an alternative. It may not be possible, but even so it may be of some value when considering improvements.

As illustrated in the tutorial, one must create a parallel_while instance, a Stream, and a Body of type T. The Body may call parallel_while::add() to add new items to the set, if constructed appropriately. To activate the loop, one calls parallel_while::run() with the Stream and Body.

That would be easier to manage if parallel_while were a function template that looks like this:

template <typename Stream, typename Body>
void parallel_while(Stream & _stream, Body & _body)
{
    parallel_while_manager<Body> manager;
    _body.set_manager(manager);
    manager.run(_stream, _body);
}

Template metaprogramming should be able to determine whether Body defines set_manager() and can dispatch to a helper function that calls it or not accordingly. Thus, if the Body wants to be able to augment the loop, it can define set_manager() and then call add() on that object as appropriate. If not, the Body can omit the function and it won't be called.
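For readers on newer compilers, the dispatch described above can be sketched with expression SFINAE. This is only an illustration under stated assumptions: it relies on C++11 `decltype`, which postdates this thread, and `Manager`, `attach_manager`, `attach_manager_impl`, `PlainBody`, and `AddingBody` are hypothetical names, not TBB types.

```cpp
// Hypothetical stand-in for the manager object; not a TBB type.
struct Manager {
    int added = 0;
    void add(int) { ++added; }
};

// Preferred overload: viable only if body.set_manager(manager) compiles.
template <typename Body>
auto attach_manager_impl(Body& body, Manager& manager, int)
    -> decltype(body.set_manager(manager), void()) {
    body.set_manager(manager);
}

// Fallback overload: used when Body has no suitable set_manager().
template <typename Body>
void attach_manager_impl(Body&, Manager&, long) {}

// Passing the int literal 0 makes the first overload the better match
// whenever it survives substitution.
template <typename Body>
void attach_manager(Body& body, Manager& manager) {
    attach_manager_impl(body, manager, 0);
}

// A Body that never adds work and so omits set_manager().
struct PlainBody {
    void operator()(int) const {}
};

// A Body that wants to augment the loop.
struct AddingBody {
    Manager* manager = nullptr;
    void set_manager(Manager& m) { manager = &m; }
    void operator()(int item) const { if (manager) manager->add(item); }
};
```

Calling `attach_manager` on a `PlainBody` is a no-op, while on an `AddingBody` it stores the manager so that `operator()` can call `add()`.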

This approach ensures that the objects are assembled correctly and makes parallel_while a function template like parallel_for() and parallel_reduce(). Have I missed anything that makes this untenable?

Yes, I believe we could add a template function largely as you describe:

#include "tbb/parallel_while.h"

template <typename Stream, typename Body>
void parallel_while_func( Stream & _stream, Body & _body ) {
    tbb::parallel_while<Body> manager;
    _body.set_manager( manager ); // having the set_manager() function
                                  // would be a new constraint on Body
    manager.run( _stream, _body );
}

We'll consider adding this based on feedback from users.

Unfortunately, your solution leads to naming confusion: "parallel_while" isn't analogous to "parallel_for" and "parallel_reduce." Instead, the analogue is "parallel_while_func." I realize I'm advocating a break with the existing naming, thus breaking existing code, but it would be pretty easy for people to retrofit their code: s/parallel_while/parallel_while_manager/. New code, of course, can be written with the function template parallel_while from the start.

You are right about naming consistency, but please remember that Intel not only offers TBB to the community but also has to support its existing commercial customers from the same code base. I bet they wouldn't be pleased to change their code each time we make another nice naming change. Unless we figure out a way to make a function and a class with the same name automatically distinguished by any C++ standard compliant compiler, they will have to have different names (and guess which will be named parallel_while? :-) ); otherwise, we wait to add this function until we decide to break backward compatibility in many places at once.

And, thank you for providing feedback - it's appreciated!

The TBB team further pursued DEADBEEF's idea of providing an alternative version of parallel_while as a function. I am going to describe here the solution we came up with; but first, I should describe what we found is not possible.

  • To our knowledge, no template metaprogramming technique is capable of checking for method existence in an arbitrary class. So the proposal to drop the set_manager() method from a Body that does not add more work is infeasible without imposing additional (and artificial) requirements on the Body, which seems unreasonable.
  • The C++ standard requires that a template name declared in namespace scope or in class scope be unique in that scope. Thus there is no way to have the same name for the parallel_while class and the new function in the tbb namespace. Moving the class name into another namespace is not a good solution either: with a 'using' directive it still would be too easy to run into compilation errors.

So for the sake of backward compatibility, we have to choose another name for the function. Possible names derived from the current class name could be parallel_while_func, parallel_while_ex, or even parallel_whilst. It might be better, however, to choose a different name, for brevity and to minimize confusion. We tentatively named the function 'parallel_apply', and I will use this name here; the final name is not yet decided, and we would appreciate feedback.

The new function is defined as

void parallel_apply( const Body& body, Stream& stream );

Obviously its semantics is "applying the body to each item in the stream".

The requirements the algorithm imposes on a Stream are the same as those of parallel_while. The requirements on a Body have changed slightly. With parallel_apply, two forms of the function call operator are allowed for a Body class. The first form should be defined by conforming Body classes that add more work on the fly. The example below shows both how the operator() should be declared and how to augment the workpile from it:

void Body::operator()( Body::argument_type& item, parallel_apply_input& input ) const
{
    // ... do some processing here;
    // assume new_item references a new piece of data to process
    input.add( new_item );
    // ...
}

The tbb::parallel_apply_input class referenced above defines the method 'add' used by the body object to augment the workpile:

class parallel_apply_input
{
public:
    void add( Item& item );
};

For better scalability, we recommend designing algorithms that augment the workpile during processing. Nevertheless, a Body class that does not add work can define the simplified form of the function call operator:

void Body::operator()( Body::argument_type& item ) const;

The parallel_apply implementation will recognize which form of operator() is defined by a particular Body, and will make the appropriate call.
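One way such recognition could work is to overload a helper on the pointer-to-member type of `&Body::operator()`. The following is a serial sketch only, not the TBB implementation: `apply_input`, `call_body`, `serial_apply`, `IntStream`, `SumBody`, and `HalvingBody` are hypothetical names, the stream is assumed to expose `pop_if_present()` as parallel_while streams do, and the trick assumes each Body defines exactly one non-template `operator()`.

```cpp
#include <vector>

// Hypothetical stand-in for parallel_apply_input; serial, not thread-safe.
template <typename Item>
class apply_input {
    std::vector<Item> pile_;
public:
    void add(const Item& item) { pile_.push_back(item); }
    bool pop(Item& item) {
        if (pile_.empty()) return false;
        item = pile_.back();
        pile_.pop_back();
        return true;
    }
};

// Selected when Body::operator() takes only the item.
template <typename Body, typename Item>
void call_body(const Body& body, Item& item, apply_input<Item>&,
               void (Body::*)(Item&) const) {
    body(item);
}

// Selected when Body::operator() also takes the input object.
template <typename Body, typename Item>
void call_body(const Body& body, Item& item, apply_input<Item>& input,
               void (Body::*)(Item&, apply_input<Item>&) const) {
    body(item, input);
}

// Serial sketch: drain the stream, then drain items the body added.
template <typename Body, typename Stream>
void serial_apply(const Body& body, Stream& stream) {
    typedef typename Body::argument_type Item;
    apply_input<Item> input;
    Item item = Item();
    while (stream.pop_if_present(item) || input.pop(item)) {
        call_body(body, item, input, &Body::operator());
    }
}

// Example stream yielding next, next+1, ..., limit-1.
struct IntStream {
    int next;
    int limit;
    bool pop_if_present(int& item) {
        if (next >= limit) return false;
        item = next++;
        return true;
    }
};

// Simplified form: no work is added.
struct SumBody {
    typedef int argument_type;
    int* total;
    void operator()(int& item) const { *total += item; }
};

// Full form: adds item/2 to the workpile while item > 1.
struct HalvingBody {
    typedef int argument_type;
    int* calls;
    void operator()(int& item, apply_input<int>& input) const {
        ++*calls;
        if (item > 1) input.add(item / 2);
    }
};
```

Both body types go through the same `serial_apply` call; overload resolution on the member-pointer argument picks the matching `call_body`, so a Body that never adds work does not need to mention the input class at all.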

In this design of the algorithm, we solved the issue of consistency with other TBB algorithms, made the concept of adding work on the fly more visible and easy to use, yet provided enough flexibility for users who don't need to use this concept.

Feedback is welcomed.

I did a little research and I think you are right that you can't discover the presence of a member function on an arbitrary class. It requires support from outside via a trait or similar. Imposing a base class could permit overloading via a dispatch function to the appropriate implementation, with new code forced to derive. Even then, extant code using your existing interface would have to be changed, which you are rightly loath to do, since the name parallel_while would change from a class template to a function template.

"parallel_apply" is an interesting name and your semantic description is apt. The name seems more fitting besides. Another name to consider is "parallel_map," with reversed semantic description: mapping each Stream item onto the Body. In that case, reversing the arguments might be likewise sensible:

parallel_map(Stream & _stream, Body const & _body);

(I prefer the non-const reference to come first, but consistency should rule.)

Either way, the new approach is much simpler than the old and will make an excellent addition to the library. I suggest that this functionality be part of your standard proposal, supplanting the existing parallel_while, and that you give it the best name there, regardless of current APIs.

See this new thread for the latest good news about the topic.
