parallel_do: the replacement to parallel_while

Alexey Kukanov (Intel):

Old inhabitants of the forum can probably recall the thread "Making parallel_while better", where we discussed whether the parallel_while algorithm could be implemented as a function instead of an object.

Now I am glad to say that we have completed reworking the algorithm. Under the names parallel_do and parallel_do_feeder, the function and its accompanying class that adds work on the fly first appeared in the October development update.

Please give it a try; your feedback is certainly appreciated.

Documentation on the algorithm will show up in one of the subsequent updates.

Alexey Kukanov (Intel):

In an internal discussion among TBB developers at Intel, we identified a possible interface change for parallel_do. Instead of requiring the user to provide the initial set of data via the Stream interface (see the TBB documentation on parallel_while for the definition of the Stream concept), an idea emerged to use two iterators, as in std::for_each.

The new parallel_do interface would be:

template<typename InputIterator, typename Body>
void parallel_do( InputIterator first, InputIterator last, Body body );

so that it could be called e.g. like:

parallel_do( my_container.begin(), my_container.end(), my_do_body );
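
To make the shape of such a call concrete, here is a rough single-threaded model of the idea, not the TBB implementation. The names sequential_do_sketch, toy_feeder and halve_body are hypothetical, and the assumption that the body receives the item together with a feeder offering an add() method is made only for illustration. The two iterators supply the initial work, and the body may add more work while it runs.

```cpp
#include <cassert>
#include <deque>
#include <iterator>
#include <vector>

// Hypothetical single-threaded stand-in for a feeder; NOT the TBB class.
template<typename Item>
class toy_feeder {
    std::deque<Item>& pending_;
public:
    explicit toy_feeder(std::deque<Item>& pending) : pending_(pending) {}
    // Assumed interface: append one more item to the pending work.
    void add(const Item& item) { pending_.push_back(item); }
};

// Hypothetical sequential model of an iterator-based "do" loop.
// [first, last) supplies the initial items; the body may feed more.
template<typename InputIterator, typename Body>
void sequential_do_sketch(InputIterator first, InputIterator last, Body body) {
    typedef typename std::iterator_traits<InputIterator>::value_type Item;
    std::deque<Item> pending(first, last);
    toy_feeder<Item> feeder(pending);
    while (!pending.empty()) {
        Item item = pending.front();  // copy before the body can grow the deque
        pending.pop_front();
        body(item, feeder);
    }
}

// Example body: records each value and feeds n/2 back while n > 1.
struct halve_body {
    std::vector<int>* visited;
    void operator()(int n, toy_feeder<int>& feeder) const {
        visited->push_back(n);
        if (n > 1) feeder.add(n / 2);
    }
};
```

Called as sequential_do_sketch( seeds.begin(), seeds.end(), body ), the model processes the seeds first and then whatever the body has fed in, which is the part a plain std::for_each cannot express.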

On one hand, the change makes tbb::parallel_do support iterators and brings it closer to the STL algorithms. On the other hand, for those who already use parallel_while, the transition to parallel_do could be a little harder.

Please tell us what you think of this proposal.

Aside from the curious reversal of the template parameters, parallel_do provides exactly the interface I had suggested. The name is meaningful yet distinct from the old parallel_while. Excellent work.

I like the iterator interface. It avoids one having to create yet another class. It is likely that the Stream type would simply adapt the interface of a container in many cases, so the proposed interface streamlines things.

However, why must you change the interface? Why not overload parallel_do and leave the choice to library users?

Alexey Kukanov (Intel):

I shouldn't have said the word completed in my initial post :), improvements will definitely follow. We also noticed the unintentional reversal of the template parameters and will fix it. Other improvements include proper access restrictions on data and methods not intended to be public, other smaller code changes, documentation in the Reference manual, examples, etc.

One of the things to decide is the iterator interface. You are right that overloading is possible, and it was our initial idea as well. However, Arch Robison, the TBB architect, objected with the following note:

"I'm against the overload until C++ 200x comes along. The reason is that I find it unsettling that two function signatures should treat one of the function arguments radically differently depending only upon the number of arguments. I could be wrong, but I believe there are no instances of such an overloading in the ISO C++ standard. Remember that the argument types are unconstrained template arguments, hence no overloading on type is possible. In C++ 200x, we'll have overloading on concepts, and then the overloading will not be so unsettling."

I will leave detailed explanations to Arch.

I recognize the architect's concern about overloading with unconstrained arguments. However, different behavior based upon number of arguments is common, and the arguments are used differently. Compilation will fail if the type isn't an iterator in the one case and a Stream in the other. Of course, you can use poor man's concepts now (see Boost techniques) to produce failures that declare the misuse more readily than would a failure deep down in some templated implementation detail.
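
As a rough sketch of what I mean, and nothing more: the two forms below differ only in the number of arguments, and each checks its argument up front so that misuse is reported at the call boundary. Every name here (make_void, is_iterator_like, is_stream_like, checked_do, toy_stream) is hypothetical, and I use static_assert from the upcoming standard for brevity where Boost's static assertion and concept-check facilities would do the same job today. The Stream check assumes only a value_type typedef and a pop_if_present member.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Helper that maps any well-formed type list to void (detection idiom).
template<typename... Ts> struct make_void { typedef void type; };

// True when std::iterator_traits<T> exposes an iterator_category.
template<typename T, typename = void>
struct is_iterator_like : std::false_type {};
template<typename T>
struct is_iterator_like<T, typename make_void<
    typename std::iterator_traits<T>::iterator_category>::type>
    : std::true_type {};

// True when T has value_type and pop_if_present(value_type&).
template<typename T, typename = void>
struct is_stream_like : std::false_type {};
template<typename T>
struct is_stream_like<T, typename make_void<
    decltype(std::declval<T&>().pop_if_present(
        std::declval<typename T::value_type&>()))>::type>
    : std::true_type {};

// Three arguments: iterator form (sequential stand-in, no parallelism).
template<typename InputIterator, typename Body>
void checked_do(InputIterator first, InputIterator last, Body body) {
    static_assert(is_iterator_like<InputIterator>::value,
                  "checked_do(first, last, body) requires iterators");
    for (; first != last; ++first) body(*first);
}

// Two arguments: Stream form (sequential stand-in, no parallelism).
template<typename Stream, typename Body>
void checked_do(Stream& stream, Body body) {
    static_assert(is_stream_like<Stream>::value,
                  "checked_do(stream, body) requires a Stream");
    typename Stream::value_type item;
    while (stream.pop_if_present(item)) body(item);
}

// Hypothetical stream used only to exercise the Stream form.
struct toy_stream {
    typedef int value_type;
    std::vector<int> items;
    std::size_t next;
    explicit toy_stream(const std::vector<int>& v) : items(v), next(0) {}
    bool pop_if_present(int& item) {
        if (next == items.size()) return false;
        item = items[next++];
        return true;
    }
};
```

Passing a Stream where iterators are expected, or the reverse, then stops at the assertion message instead of somewhere inside the loop body.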

You can opt to wait for C++0x to get real concepts, but a great many folks will not upgrade to a concepts compliant compiler until many years hence. Why deny this useful functionality now to get even better diagnostics later?

What would be more confusing, to me at least, is yet another template function name to differentiate the two argument lists. They do precisely the same thing. All that differs is how one feeds work to them.

Alexey Kukanov (Intel):

DEADBEEF: What would be more confusing, to me at least, is yet another template function name to differentiate the two argument lists. They do precisely the same thing. All that differs is how one feeds work to them.

I agree with you here.

Everything said above explains why we are thinking of dropping Stream as the initial feeding interface in favor of iterators. For parallel_do we do not yet have legacy users who would be affected. And if we provide two parallel_do interfaces now, we will then have to support both, even though one of them might not be of much use.

Frankly, though migration from parallel_while won't be straightforward with the interface change to iterators in parallel_do, I do not think it would be too hard either. However, there could be complicated cases, for example, if tbb::concurrent_queue was used as the feeding Stream.

MADakukanov:
Everything said above explains why we are thinking of dropping Stream as the initial feeding interface in favor of iterators.

That certainly eliminates the concern, but is it wise?

MADakukanov:
Frankly, though migration from parallel_while won't be straightforward with the interface change to iterators in parallel_do, I do not think it would be too hard either. However, there could be complicated cases, for example, if tbb::concurrent_queue was used as the feeding Stream.

Don't forget those who are comfortable with the Stream concept and would prefer to use parallel_do just as they use the other functionality. Being able to reuse a Stream seems beneficial.

Arch D. Robison (Intel):

As the architect, I have to keep a lid on complexity. Piling on more and more features (as marketing would be too happy to do!) raises the learning curve for everyone, adds to the testing burden, and contributes to compile-time or run-time bloat. And once a feature is added, it's tough to remove it. My philosophy is when in doubt, leave it out. I'm Scrooge.

I like to keep things lean and fast too, because in most cases users are going to wrap our stuff in their own wrappers anyway, since different users have different requirements. Or they need a layer delineating isolation from a vendor's library. Keeping TBB simple makes wrapping it simple. E.g., uganson was able to write a small wrapper for class pipeline that gains type safety at the expense of some flexibility. That's the sort of thing I expect users to do, and proof of easy wrapping.

I disagree that "different behavior based upon number of arguments is common". I skimmed chapter 25, "Algorithms library", of the C++ standard. I did not see any algorithms where changing the number of arguments changed the interpretation of an argument. Changing argument interpretation based on number of arguments also thwarts handy tools like Visual C++'s Intellisense.

parallel_while will be deprecated, but not yet gone. Deprecation serves as notice that it might disappear in the future. In practice, it will probably be around for many years, just like most deprecated C++ features. So in the meantime, code using parallel_while will still work.

It would not be difficult to write an adaptor that creates an input iterator that iterates over a parallel_while-style stream. If we're minimalist with the number of signatures required by parallel_do, such a class can be small. For example, the following definition would suffice for the iterator, even though it falls short of being a full ISO input iterator.

    template<typename Stream>
    class tbb_stream_iterator: public std::iterator<std::input_iterator_tag, typename Stream::value_type> {
        Stream* my_stream;
        typedef typename Stream::value_type value_type;
        value_type my_item;
    public:
        // Construct input iterator representing end of stream.
        tbb_stream_iterator() : my_stream(NULL) {}
        // Construct input iterator representing front of stream.
        tbb_stream_iterator( Stream& stream ) : my_stream(&stream) {
            operator++();
        }
        bool operator==( const tbb_stream_iterator& other ) const {return my_stream==other.my_stream;}
        const value_type& operator*() const {return my_item;}
        const tbb_stream_iterator& operator++() {
            if( !my_stream->pop_if_present(my_item) )
                my_stream = NULL;
            return *this;
        }
    };
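
For anyone who wants to try the idea without TBB, the following is a self-contained sketch under stated assumptions rather than a definitive implementation. The names toy_stream, stream_iterator_sketch and drain are hypothetical. The sketch follows the adaptor above, except that it uses member typedefs in place of the std::iterator base and adds operator!=, which a plain loop from first to last needs. The loop in drain is a sequential stand-in for whatever would consume the iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical parallel_while-style stream; NOT a TBB class.
class toy_stream {
    std::vector<int> items_;
    std::size_t next_;
public:
    typedef int value_type;
    explicit toy_stream(const std::vector<int>& items)
        : items_(items), next_(0) {}
    // Fetch the next item if there is one; return false when exhausted.
    bool pop_if_present(int& item) {
        if (next_ == items_.size()) return false;
        item = items_[next_++];
        return true;
    }
};

// Minimal input-iterator adaptor over such a stream.
template<typename Stream>
class stream_iterator_sketch {
    Stream* my_stream;                     // NULL marks end of stream
    typename Stream::value_type my_item;   // most recently popped item
public:
    typedef std::input_iterator_tag iterator_category;
    typedef typename Stream::value_type value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const value_type* pointer;
    typedef const value_type& reference;

    // End-of-stream iterator.
    stream_iterator_sketch() : my_stream(NULL), my_item() {}
    // Front-of-stream iterator: pops the first item immediately.
    explicit stream_iterator_sketch(Stream& stream)
        : my_stream(&stream), my_item() { ++*this; }

    bool operator==(const stream_iterator_sketch& other) const {
        return my_stream == other.my_stream;
    }
    bool operator!=(const stream_iterator_sketch& other) const {
        return !(*this == other);
    }
    reference operator*() const { return my_item; }
    stream_iterator_sketch& operator++() {
        if (!my_stream->pop_if_present(my_item)) my_stream = NULL;
        return *this;
    }
};

// Sequential stand-in for a consumer: collects everything in the stream.
template<typename Stream>
std::vector<typename Stream::value_type> drain(Stream& stream) {
    std::vector<typename Stream::value_type> out;
    stream_iterator_sketch<Stream> first(stream), last;
    for (; first != last; ++first) out.push_back(*first);
    return out;
}
```

Note that the iterator consumes the stream as it advances, so a stream can be traversed only once this way, which is all an input iterator promises.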

Another problem I have with enabling two forms for the input stream is that we are applying lipstick to a frog. The stream model is inherently sequential, and thus flawed for scalable parallel programming.* The "feeder" part of parallel_do is what makes it scalable. Making the non-scalable aspect of parallel_do easier to use seems like misplaced effort.

- Arch

*Yes, we provide concurrent_queue. It's because it's occasionally useful. But queues are very much overrated as a data structure for parallel programming. By definition they are bottlenecks. The joke I use when giving talks is that a queue is a data structure for ensuring data gets cold in cache before you use it.

MADadrobiso:
I disagree that "different behavior based upon number of arguments is common". I skimmed chapter 25, "Algorithms library", of the C++ standard. I did not see any algorithms where changing the number of arguments changed the interpretation of an argument.

I didn't mean to suggest that it was common in the Standard, just that it was common in C++. Thus, it should be no surprise to TBB clients.

MADadrobiso:
Changing argument interpretation based on number of arguments also thwarts handy tools like Visual C++'s Intellisense.

How? Such tools would display the alternatives from what was typed. The user can select among those, right? (I don't use MSVC's IDE, so I am not speaking from experience.)

MADadrobiso:
parallel_while will be deprecated, but not yet gone. Deprecation serves as notice that it might disappear in the future. In practice, it will probably be around for many years, just like most deprecated C++ features. So in the meantime, code using parallel_while will still work.

This is true, of course, but not germane, I think. If the feature is valuable, include it and don't worry about removing it later. If it isn't valuable enough to include, then don't. (See below.)

MADadrobiso:
It would not be difficult to write an adaptor that creates an input iterator that iterates over a parallel_while-style stream.

Providing an adaptor would work fine, of course.

MADadrobiso:
Another problem I have with enabling two forms for the input stream is that we are applying lipstick to a frog. The stream model is inherently sequential, and thus flawed for scalable parallel programming.* The "feeder" part of parallel_do is what makes it scalable. Making the non-scalable aspect of parallel_do easier to use seems like misplaced effort.

If you do away with the stream model for the other functions, then don't introduce it for parallel_do. If you don't intend to eliminate it elsewhere, then it seems that it should be usable with parallel_do.

Arch D. Robison (Intel):

The stream model is not used in any of the other parallel functions. It is only used in concurrent_queue and parallel_while.

MADadrobiso: The stream model is not used in any of the other parallel functions. It is only used in concurrent_queue and parallel_while.

I guess my ignorance is showing. Given that statement and your intention to deprecate parallel_while, I think using an adaptor to make an iterable stream and only having parallel_do take iterators is the appropriate course.
