parallel_for is easier with lambdas, Intel Threading Building Blocks

Lambdas are an exciting new addition to C++ in the current draft for C++0x (see my prior post, "Hello Lambda", for my introduction to lambdas). The Intel compilers support them now, and Microsoft has support in their beta for Visual Studio 2010. I think we can expect to see support for lambdas added quickly, and a great deal of interest in using them in C++ code. Lambdas quite simply allow code to be specified inline, in ways that I find particularly useful for parallel programming constructs - notably for Intel Threading Building Blocks (TBB).

Intel TBB supports forms of algorithms in the old style (without lambdas) and in the new style with lambdas.

To get an idea of why I expect lambdas will be very popular with Intel TBB, we can look at the "with" and "without" syntax.

The "old" syntax, before lambdas were available, involved coding your "work to be done in parallel" into an operator() within a class. It was the toughest thing to teach about using Intel TBB - and it is THE reason C programmers complained about "C++ syntax" being needed with Intel TBB.

parallel_for(range, body, optional partitioner) w/out lambdas

#include "tbb/tbb.h"
using namespace tbb;
// Body object: the loop is wrapped in a class whose operator() handles one subrange.
// Foo is assumed to be defined elsewhere.
class ApplyFoo {
  float *const my_a;
public:
  void operator()( const blocked_range<size_t>& r ) const {
    float *a = my_a;
    for( size_t i=r.begin(); i!=r.end(); ++i )
      Foo(a[i]);
  }
  ApplyFoo( float a[] ) :
    my_a(a) {}
};

void ParallelApplyFoo( float a[], size_t n ) {
  parallel_for(blocked_range<size_t>(0,n), ApplyFoo(a));
}
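
To make the role of operator() concrete, here is a minimal serial sketch of what a range-based parallel_for asks of the body object. It does not use Intel TBB at all: SimpleRange, run_in_chunks, ScaleBody, scale_all, and the grain size are hypothetical stand-ins of mine, and the real library splits ranges recursively and hands the pieces to worker threads rather than walking them in order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tbb::blocked_range<size_t>: a half-open range [begin, end).
struct SimpleRange {
  std::size_t b, e;
  std::size_t begin() const { return b; }
  std::size_t end() const { return e; }
};

// Hypothetical serial stand-in for parallel_for: cuts [first, last) into chunks of
// at most `grain` elements and calls the body once per chunk. Intel TBB would run
// chunks on different threads; here they simply run one after another.
template <typename Body>
std::size_t run_in_chunks(std::size_t first, std::size_t last,
                          std::size_t grain, const Body& body) {
  assert(grain > 0);
  std::size_t calls = 0;
  for (std::size_t lo = first; lo < last; lo += grain) {
    std::size_t hi = (last - lo < grain) ? last : lo + grain;
    body(SimpleRange{lo, hi});
    ++calls;
  }
  return calls;
}

// Body class in the same shape as ApplyFoo; doubling stands in for Foo.
class ScaleBody {
  float* const my_a;
public:
  explicit ScaleBody(float* a) : my_a(a) {}
  void operator()(const SimpleRange& r) const {
    for (std::size_t i = r.begin(); i != r.end(); ++i)
      my_a[i] *= 2.0f;
  }
};

// Doubles every element and reports how many times the body was invoked.
std::size_t scale_all(std::vector<float>& v, std::size_t grain) {
  return run_in_chunks(0, v.size(), grain, ScaleBody(v.data()));
}
```

Because the body may be called many times, each time on a different subrange, operator() has to be const and has to work purely from the range it is given. That is the part people found awkward to write by hand.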

The same program written with lambdas is actually quite readable:

parallel_for(range, body, optional partitioner) with lambdas

#include "tbb/tbb.h"
using namespace tbb;
void ParallelApplyFoo( float* a, size_t n ) {
  parallel_for( blocked_range<size_t>(0,n),
    [=](const blocked_range<size_t>& r) {
      for(size_t i=r.begin(); i!=r.end(); ++i)
        Foo(a[i]);
    }
  );
}
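
The lambda plays the same role as the ApplyFoo object: the compiler generates an unnamed class with an operator(), and [=] copies the captured variables into it. The sketch below, which again does not use Intel TBB, shows that capturing the pointer a by value copies only the pointer, so writes through it still reach the caller's array. IndexRange, apply_to_range, the two ApplyFooWith... functions, and the body of Foo are hypothetical; the article never says what Foo does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Foo: assumed here to take its argument by reference and add 1.
inline void Foo(float& x) { x += 1.0f; }

// Hypothetical stand-in for blocked_range<size_t>.
struct IndexRange {
  std::size_t b, e;
  std::size_t begin() const { return b; }
  std::size_t end() const { return e; }
};

// Hypothetical serial stand-in for parallel_for(range, body): a single call
// on the whole range, with no threading.
template <typename Body>
void apply_to_range(const IndexRange& r, const Body& body) { body(r); }

// Hand-written body object, as in the pre-lambda style.
class ApplyFoo {
  float* const my_a;
public:
  explicit ApplyFoo(float* a) : my_a(a) {}
  void operator()(const IndexRange& r) const {
    for (std::size_t i = r.begin(); i != r.end(); ++i)
      Foo(my_a[i]);
  }
};

void ApplyFooWithClass(float* a, std::size_t n) {
  apply_to_range(IndexRange{0, n}, ApplyFoo(a));
}

// Same work with a lambda: [=] copies the pointer `a` into the closure,
// which is what the my_a member does in the class above.
void ApplyFooWithLambda(float* a, std::size_t n) {
  apply_to_range(IndexRange{0, n}, [=](const IndexRange& r) {
    for (std::size_t i = r.begin(); i != r.end(); ++i)
      Foo(a[i]);
  });
}
```

Both functions leave the array in the same state. The lambda just saves you from writing the class, the member, and the constructor.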

Starting with Intel TBB 2.2, there is another form of parallel_for as well - one that some consider a little easier to read:

parallel_for(first, last, step, function) with lambdas

#include "tbb/tbb.h"
using namespace tbb;
void ParallelApplyFoo(float a[], size_t n) {
  parallel_for(size_t(0), n, size_t(1) , [=](size_t i) {Foo(a[i]);});
}
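
The index form behaves like an ordinary counted loop whose iterations may run concurrently: the function is called once for each i = first, first + step, and so on, for as long as i stays below last. Here is a minimal serial sketch of those semantics, without Intel TBB. serial_for_index and visited_indices are hypothetical names, and a real parallel_for makes no promise about the order of the calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial stand-in for parallel_for(first, last, step, f):
// calls f(i) for i = first, first + step, ... while i < last.
template <typename Index, typename Func>
void serial_for_index(Index first, Index last, Index step, const Func& f) {
  assert(step > 0);
  for (Index i = first; i < last; i += step)
    f(i);
}

// Records which indices the loop body is called with.
std::vector<std::size_t> visited_indices(std::size_t first, std::size_t last,
                                         std::size_t step) {
  std::vector<std::size_t> seen;
  serial_for_index(first, last, step,
                   [&](std::size_t i) { seen.push_back(i); });
  return seen;
}
```

Note that first, last, and step share one template type. That is why the listing above writes size_t(0) and size_t(1): a bare 0 would be an int and would not match the size_t n.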

Final note - ridding yourself of compiler warnings.

The code above currently produces a warning with the Intel compiler. You can simply ignore it, or you can eliminate it using:

//The pragma turns off warnings from the compiler about "use of a local type to declare a function".
#pragma warning( disable: 588)
