Lambda not working

Hi,
I've installed gcc version 4.4.2 on my Fedora system. I got the below code from this forum, but when I compile, I get:
g++ -ltbb -o lambda1 lambda1.cpp

lambda1.cpp: In function void par_ms(int, int, int*):
lambda1.cpp:43: error: expected primary-expression before [ token
lambda1.cpp:43: error: expected primary-expression before ] token
lambda1.cpp:44: error: expected primary-expression before [ token
lambda1.cpp:44: error: expected primary-expression before ] token

I thought lambdas were supported in the latest version of gcc. Please help.

#include <iostream>
#include <vector>
#include "tbb/tbb.h"

#define N  9999999

using namespace std;
using namespace tbb;

void merge(int beg, int mid, int end, int *A)
{
    vector<int> tmp;
    int i = beg;
    int j = mid;

    while ( ( i < mid ) && ( j < end ) )
    {
        if ( A[i] < A[j] ) {tmp.push_back( A[i] );i++;} else {tmp.push_back( A[j] );j++;}
    }

    while ( i < mid )
    {
        tmp.push_back( A[i] );
        i++;
    }

    while ( j < end )
    {
        tmp.push_back( A[j] );
        j++;
    }

    for ( int t = 0; t < (int) tmp.size(); t++ ) {A[ beg + t ] = tmp[t];}
}

void par_ms(int beg, int end, int *A)
{
    if ( beg + 1 == end ) {return;}

    int mid = beg + (end - beg)/2;

    parallel_invoke(
        [&](){ par_ms(beg, mid, A); },
        [&](){ par_ms(mid, end, A); }
    );

    merge( beg, mid, end, A);

    return;
}

int main()
{
    task_scheduler_init init(-1);

    static int A[N];  // static storage: about 40 MB would overflow a typical stack

    for ( int i = 0; i < N; i++ ) {A[i] = N - i;}

    par_ms(0, N, A);

    for ( int i = 0; i < 10; i++ ) {cout << i << " " << A[i] << endl;}
    for ( int i = N-10; i < N; i++ ) {cout << i << " " << A[i] << endl;}

    return 0;
}//main


Quoting - Nav
Hi,
I've installed gcc version 4.4.2 on my Fedora system. I got the below code from this forum, but when I compile, I get:
g++ -ltbb -o lambda1 lambda1.cpp

Supply -std=c++0x key
http://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html

All about lock-free algorithms, multicore, scalability, parallel computing and related topics:
http://www.1024cores.net

Quoting - Dmitriy Vyukov

Supply -std=c++0x key
http://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html

Thanks.
Tried g++ -std=c++0x -ltbb -o lambda1 lambda1.cpp

and got:

In file included from /home/username/TBB/tbbSource/include/tbb/_concurrent_queue_internal.h:37,
from /home/username/TBB/tbbSource/include/tbb/concurrent_queue.h:32,
from /home/username/TBB/tbbSource/include/tbb/tbb.h:47,
from lambda1.cpp:3:
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:278: error: exception_ptr in namespace std does not name a type
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:294: error: expected unqualified-id before & token
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:294: error: expected ) before & token
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:294: error: expected ; before & token
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:295: error: expected ; before tbb_exception_ptr
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h: In member function void tbb::internal::tbb_exception_ptr::throw_self():
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:291: error: rethrow_exception is not a member of std
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:291: error: my_ptr was not declared in this scope
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h: In constructor tbb::internal::tbb_exception_ptr::tbb_exception_ptr(const tbb::captured_exception&):
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:295: error: class tbb::internal::tbb_exception_ptr does not have any field named my_ptr
/home/username/TBB/tbbSource/include/tbb/tbb_exception.h:295: error: copy_exception is not a member of std
lambda1.cpp: In function void par_ms(int, int, int*):
lambda1.cpp:43: error: expected primary-expression before [ token
lambda1.cpp:43: error: expected primary-expression before ] token
lambda1.cpp:44: error: expected primary-expression before [ token
lambda1.cpp:44: error: expected primary-expression before ] token

Quoting - Nav

Quoting - Dmitriy Vyukov

Supply -std=c++0x key
http://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html

Thanks.
Tried g++ -std=c++0x -ltbb -o lambda1 lambda1.cpp

and got:

Hmm... well, the following table suggests that gcc 4.4 supports the "Propagating exceptions" C++0x feature:
http://gcc.gnu.org/projects/cxx0x.html

I dunno. You may try to supply -std=gnu++0x
or update to the latest gcc
or just wait for comments from the TBB team

Btw, what does the documentation say about lambda support? Which compilers are supported?


Quoting - Dmitriy Vyukov

Hmm... well, the following table suggests that gcc 4.4 supports the "Propagating exceptions" C++0x feature:
http://gcc.gnu.org/projects/cxx0x.html

I dunno. You may try to supply -std=gnu++0x
or update to the latest gcc
or just wait for comments from the TBB team

Btw, what does the documentation say about lambda support? Which compilers are supported?

Tried with -std=gnu++0x
Same errors showed up.

The version of gcc I have is the latest. I just configured, built and installed it myself a few days back. Version 4.4.2.
The gcc 4.4.2 documentation does not even have the word 'lambda' in it.
But basically, gcc 4.4.x series is supposed to support lambda.

"But basically, gcc 4.4.x series is supposed to support lambda."

Really?

Quoting - Raf Schietekat

"But basically, gcc 4.4.x series is supposed to support lambda."

Really?

Yes, that's what a person at LinuxQuestions.org had to say about it.
Do you mean to say that only version 4.5 of gcc supports lambda? I saw that page earlier too, but there appears to be no solid evidence which says anything about lambda support.
Besides, the current stable release of gcc is 4.4.2. http://gcc.gnu.org/

So how do I get the program to work? Anyone?

Quoting - Nav

Yes, that's what a person at LinuxQuestions.org had to say about it.
Do you mean to say that only version 4.5 of gcc supports lambda? I saw that page earlier too, but there appears to be no solid evidence which says anything about lambda support.
Besides, the current stable release of gcc is 4.4.2. http://gcc.gnu.org/

So how do I get the program to work? Anyone?

Switch to the Intel V11 compiler, which does support lambdas? :-) (Just a thought!) (Or at least switch to the Intel Compiler forum, where you might hit more people with gcc expertise?)

Quoting - Nav
So how do I get the program to work? Anyone?

Rewrite to use explicit function objects (quite tedious but should work).
Maybe argument binding can help, though I don't know how to do it right.

@Alexey and Robert:
Thanks, but using gcc is a 'have to' for me right now.
I need to see lambdas working for me on gcc.
I will be posting in the Intel Compiler forum. If anyone here knows how to solve the problem, then please help out, because I'll be referring to this thread too.
If I get any pointers, I'll contribute them back to this thread.

"I need to see lambdas working for me on gcc."
Can you explain that? For TBB, lambdas are just sugar (you don't need them), so how is this going to affect your decision about TBB if the decision to use g++ has already been made?

Quoting - Raf Schietekat
"I need to see lambdas working for me on gcc."
Can you explain that? For TBB, lambdas are just sugar (you don't need them), so how is this going to affect your decision about TBB if the decision to use g++ has already been made?

Dear Raf, let me assure you that I'm not playing a game here. There are certain constraints I'm working with, and would be really grateful if the lambda problem is answered.

"Dear Raf, let me assure you that I'm not playing a game here. There are certain constraints I'm working with, and would be really grateful if the lambda problem is answered."
You can both motivate and help us to help you by providing an answer to my question.

Quoting - Raf Schietekat
"Dear Raf, let me assure you that I'm not playing a game here. There are certain constraints I'm working with, and would be really grateful if the lambda problem is answered."
You can both motivate and help us to help you by providing an answer to my question.

Okay, I trust you have a reason for asking so:

Taking the reference of one of my previous posts:
http://software.intel.com/en-us/forums/showthread.php?t=70511

Firstly, I was very surprised that parallel_for was designed so that it always requires a function object (the obvious comparison being with OpenMP: if a simple pragma is enough there, then why not something simple here too?).
This adds to the coding complexity and the amount of code that has to be written when developing a large application. Obviously, using lambdas is going to simplify my work and reduce the time it takes.

When Intel says that the amount of coding needed with TBB is much less, they are comparing it with ordinary threading. Apparently, the amount of coding would be the same or less if I used OpenMP.

Secondly, there are plenty of programmers who need to work with gcc. Learning the complex TBB would not be much of an incentive to a programmer if lambdas (or any other feature that simplifies TBB) did not work in gcc (I mention this knowing that lambdas are a feature of C++0x and not of TBB).

Thirdly, I'm well aware that Intel encourages the use of OpenMP for specific types of programming needs. But for now, I need to use lambdas in place of function objects in my TBB program, which has to be compiled with gcc 4.4.2. Could anyone help out?

Quoting - Nav
@Alexey and Robert:
Thanks, but using gcc is a 'have to' for me right now.
I need to see lambdas working for me on gcc.

As Raf said, lambdas are just syntactic sugar for more convenient use of TBB. A lambda is just an unnamed functor class written by the compiler. You can achieve the same effect by writing an explicit function class. The arguments passed to the function should be captured by the instance of the class (i.e. passed to its constructor and saved in the instance, either by value or by reference). The class should have an operator() method with no parameters; inside that method, you call the function, passing the arguments captured earlier.

If for some reason you have to use both g++ and lambdas, then you have to wait for GCC 4.5.
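The function-object approach described above could look roughly like this for the par_ms example from the first post. This is only a sketch: invoke_both is a hypothetical serial stand-in so that the code builds without TBB (with TBB you would call tbb::parallel_invoke with the same two objects), ParMsTask is a made-up name, and std::inplace_merge stands in for the hand-written merge().

```cpp
#include <algorithm>

// Hypothetical serial stand-in for tbb::parallel_invoke, so this sketch
// builds without TBB. With TBB, call tbb::parallel_invoke(f0, f1) instead.
template <typename F0, typename F1>
void invoke_both(const F0& f0, const F1& f1)
{
    f0();
    f1();
}

void par_ms(int beg, int end, int* A);

// Explicit function object doing the job of [&](){ par_ms(beg, end, A); }.
// The arguments are passed to the constructor and saved in the instance.
class ParMsTask
{
    int  my_beg;
    int  my_end;
    int* my_A;
public:
    ParMsTask(int beg, int end, int* A) : my_beg(beg), my_end(end), my_A(A) {}

    // No parameters: the call uses the arguments captured earlier.
    void operator()() const { par_ms(my_beg, my_end, my_A); }
};

void par_ms(int beg, int end, int* A)
{
    if (end - beg <= 1) { return; }

    int mid = beg + (end - beg) / 2;

    // Two function objects replace the two lambdas.
    invoke_both(ParMsTask(beg, mid, A), ParMsTask(mid, end, A));

    // std::inplace_merge stands in for the thread's hand-written merge().
    std::inplace_merge(A + beg, A + mid, A + end);
}
```

Calling par_ms(0, n, A) on an int array of length n sorts it in place. Nothing here needs C++0x, so it should compile with -std=c++98 as well.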

Quoting - Nav
Firstly, I was very surprised that parallel_for was designed so that it always requires a function object (the obvious comparison being with OpenMP: if a simple pragma is enough there, then why not something simple here too?).
This adds to the coding complexity and the amount of code that has to be written when developing a large application. Obviously, using lambdas is going to simplify my work and reduce the time it takes.

When Intel says that the amount of coding needed with TBB is much less, they are comparing it with ordinary threading. Apparently, the amount of coding would be the same or less if I used OpenMP.

Secondly, there are plenty of programmers who need to work with gcc. Learning the complex TBB would not be much of an incentive to a programmer if lambdas (or any other feature that simplifies TBB) did not work in gcc (I mention this knowing that lambdas are a feature of C++0x and not of TBB).

Using function objects is the usual practice for template C++ libraries such as STL and TBB. I doubt parallel_for could be written in a generic way without using a function object. Unfortunately, with a pure library we could not achieve the same simplicity that is possible with compiler support (e.g. in OpenMP or Cilk++). So you are right that TBB requires more coding than OpenMP. Meanwhile, TBB is not positioned as an OpenMP replacement; if the latter works for you, then just use it.

I agree with the rest of what you said. In fact, lambda support in the Intel Compiler was to a great extent driven by the desire to simplify the use of TBB.

So TBB does support GCC and does work with lambdas; but GCC needs to understand lambdas before you can start using both (with or without TBB).

#13 "Okay, I trust you have a reason for asking so:"
Thanks for the extra information.

I like that TBB works with currently-standard C++, but it would indeed be nice to be able to dispense with some of the boilerplate, although, if I understand correctly, it's not "need to" so much as "want to". What alternatives are you contemplating?

Perhaps you could write the code both without and with lambdas, and use a preprocessor switch to activate one or the other. When the time comes to throw the switch, it will be easier to only have to verify that the program still works, and then you can remove the old version if desired.

This may look like a lot of overhead if you zoom in on it, but Amdahl (reversed) sometimes also applies to programming effort: writing function objects is a fairly straightforward translation from lambdas, with roughly linear cost over part of the code, and then there's still testing etc., so it may not amount to that much in the big picture, as opposed to something like rearchitecting. Your decision, of course. :-)
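The preprocessor switch suggested above might be sketched as follows. All names are illustrative assumptions: USE_LAMBDAS is a made-up macro, ScaleBody and scale are made-up examples, and for_each_index is a hypothetical serial stand-in for a TBB algorithm such as tbb::parallel_for(first, last, body).

```cpp
#include <vector>

// Hypothetical switch: build with -DUSE_LAMBDAS=1 once the compiler
// supports lambdas; the default keeps the function-object version.
#ifndef USE_LAMBDAS
#define USE_LAMBDAS 0
#endif

// Hypothetical serial stand-in for tbb::parallel_for(first, last, body).
template <typename Body>
void for_each_index(int first, int last, const Body& body)
{
    for (int i = first; i < last; ++i) { body(i); }
}

#if !USE_LAMBDAS
// Function-object version of the loop body.
class ScaleBody
{
    std::vector<int>& my_data;
    int               my_factor;
public:
    ScaleBody(std::vector<int>& data, int factor)
        : my_data(data), my_factor(factor) {}

    void operator()(int i) const { my_data[i] *= my_factor; }
};
#endif

// Multiplies every element of data by factor.
void scale(std::vector<int>& data, int factor)
{
    const int n = static_cast<int>(data.size());
#if USE_LAMBDAS
    for_each_index(0, n, [&](int i) { data[i] *= factor; });
#else
    for_each_index(0, n, ScaleBody(data, factor));
#endif
}
```

Both branches should behave the same, so throwing the switch later only requires re-running the existing tests before deleting the old branch.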

Note that Microsoft, in their design of their PPL library for parallel programming, came to the same conclusion that function objects are fundamental to writing a parallel programming library for C++. VS 2010 will support the lambda syntactic sugar. The alternative would have been pragmas like OpenMP or keywords like Cilk. But those would require special compilers, and one of TBB's goals is portability. (On the other hand, Cilk can do some powerful stuff with its compiler support.)

In another year, you'll wonder how you ever got along without lambdas.

Quoting - Raf Schietekat

writing function objects is a fairly straightforward translation from lambdas, with roughly linear cost over part of the code, and then there's still testing etc.,

Yes, roughly linear, though the linear scale factor may vary depending on the local context around the kernel. I personally hope I never have to use the non-lambda approach again (not realistic in my job), having had my fill of it with pre-lambda compilers. The biggest challenge comes when the kernel you're trying to run parallel has a lot of local context (variables it mostly needs to read during execution) because making those variables available requires an explicit linkage, such as through a class initialization function that can have a hairy number of arguments you'll need to get right in both the declaration and the call. With the lambda construct, most of this can be handled with the simple [&] context syntax in the lambda definition. But as Alexey has said repeatedly, there's nothing you can do with lambdas functionally that you cannot do with an explicit function object class.
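To make the "explicit linkage" point concrete, here is a small sketch of the same kernel written both ways. The kernel, UpdateBody and both function names are invented for illustration, and for_each_index is again a hypothetical serial stand-in for tbb::parallel_for(first, last, body). The lambda version needs a compiler with lambda support.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical serial stand-in for tbb::parallel_for(first, last, body).
template <typename Body>
void for_each_index(std::size_t first, std::size_t last, const Body& body)
{
    for (std::size_t i = first; i < last; ++i) { body(i); }
}

// Explicit version: every piece of local context needs a member, a
// constructor parameter and an initializer, all of which must agree.
class UpdateBody
{
    const std::vector<double>& my_x;
    std::vector<double>&       my_y;
    double                     my_alpha;
    double                     my_beta;
public:
    UpdateBody(const std::vector<double>& x, std::vector<double>& y,
               double alpha, double beta)
        : my_x(x), my_y(y), my_alpha(alpha), my_beta(beta) {}

    void operator()(std::size_t i) const
    {
        my_y[i] = my_alpha * my_x[i] + my_beta * my_y[i];
    }
};

void update_with_functor(const std::vector<double>& x, std::vector<double>& y,
                         double alpha, double beta)
{
    for_each_index(0, y.size(), UpdateBody(x, y, alpha, beta));
}

// Lambda version: [&] links x, y, alpha and beta in one go.
void update_with_lambda(const std::vector<double>& x, std::vector<double>& y,
                        double alpha, double beta)
{
    for_each_index(0, y.size(),
                   [&](std::size_t i) { y[i] = alpha * x[i] + beta * y[i]; });
}
```

With only four context variables the difference is modest; the constructor boilerplate grows with every additional variable the kernel touches, while the lambda stays the same size.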

Robert,

While I agree that lambdas ease the programming, this does come at an expense. From my (limited) experience, lambdas have two issues:

a) they have a little higher overhead than an explicit functor with args or ->context, therefore the body of the lambda function must perform more work in order for the extra cost to be amortized.

b) When using [&] with objects that have reference counters, you cannot turn off the IncReferences()/DecReferences(), meaning these must now include locks (runs slower). Pointers passed from outside the scope of the lambda to inside the scope of the lambda might be more suitable. IOW, create the additional reference(s) and pointers to these references outside the scope of the parallel_xxx with the [&] lambda, and use the pointers inside the lambda function.

Jim Dempsey

www.quickthreadprogramming.com

@Alexey:
Okay, so it's either gcc 4.5 (sometime in the future) or the trial version of Intel C++ compiler that I can try out for lambdas.

@Raf:
"if I understand correctly, it's not "need to" so much as "want to". What alternatives are you contemplating?"
I appreciate the help, but I wouldn't want to go into details.

Were you trying to say that "lambda" is almost the reverse of "amdahl"? That's cool :)
Yes, the cost perspective is also presented by Jim. There are other factors I'd mention too, but that's out of the scope of this discussion.

@Arch:
"In another year, you'll wonder how you ever got along without lambdas"
So true :)
I had read the part about how function objects are required when new threads need to use a copy constructor. It's interesting to hear about the Microsoft perspective too.

Quoting - jimdempseyatthecove

Robert,

While I agree that lambdas ease the programming, this does come at an expense. From my (limited) experience, lambdas have two issues:

a) they have a little higher overhead than an explicit functor with args or ->context, therefore the body of the lambda function must perform more work in order for the extra cost to be amortized.

b) When using [&] with objects that have reference counters, you cannot turn off the IncReferences()/DecReferences(), meaning these must now include locks (runs slower). Pointers passed from outside the scope of the lambda to inside the scope of the lambda might be more suitable. IOW, create the additional reference(s) and pointers to these references outside the scope of the parallel_xxx with the [&] lambda, and use the pointers inside the lambda function.

Jim Dempsey

I'm a bit surprised because I thought that lambdas would be converted into something like "inline code" at compile time. But that was just an assumption. It's interesting to hear your viewpoint.
It'd be great to know what percentage increase there is in the overhead.

Quoting - jimdempseyatthecove
From my (limited) experience, lambdas have two issues:

a) they have a little higher overhead than an explicit functor with args or ->context, therefore the body of the lambda function must perform more work in order for the extra cost to be amortized.

b) When using [&] with objects that have reference counters, you cannot turn off the IncReferences()/DecReferences(), meaning these must now include locks (runs slower). Pointers passed from outside the scope of the lambda to inside the scope of the lambda might be more suitable. IOW, create the additional reference(s) and pointers to these references outside the scope of the parallel_xxx with the [&] lambda, and use the pointers inside the lambda function.

An interesting observation, Jim. This is the first I've heard of such an issue. Sounds like something we should gather evidence for, or at least independently verify. Unfortunately at the moment, my blog is on a bit of a hiatus as I work through some strange compiler optimization anomalies I encountered back in December, working with the compiler team, and I have a couple other high priority tasks that will keep me from trying anything for at least a week or two. Has anyone else observed this phenomenon?

To Nav I'd reply that lambdas are a little more complicated than you might suspect. Implementing them just as inline code would not provide the flexibility to allow independent threads to call them (need an entry point for that). The context (the [&] stuff) enables dynamic binding of local independent variables for the associated function and the result represents a full closure, though there might be some functional programming purists in the audience who might dispute that.
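A small sketch of the closure idea, under loud assumptions: run_all and sum_of_squares are made-up names, everything runs serially, and this is not how TBB schedules work. It only shows that each lambda becomes an object with its own entry point (operator()) and bound context, which can be stored and then called from code other than where it was written.

```cpp
#include <functional>
#include <vector>

// Calls every stored closure through its entry point. A real scheduler
// would hand these objects to worker threads; this sketch stays serial.
void run_all(const std::vector<std::function<void()>>& tasks)
{
    for (const auto& task : tasks) { task(); }
}

int sum_of_squares(int n)
{
    int total = 0;
    std::vector<std::function<void()>> tasks;

    for (int i = 1; i <= n; ++i) {
        // Each closure binds total by reference and its own copy of i.
        tasks.push_back([&total, i]() { total += i * i; });
    }

    // The closures run inside run_all, not here, yet they still update
    // the local variable total through the captured reference.
    run_all(tasks);
    return total;
}
```

If these closures really ran on several threads at once, the shared update of total would need synchronization; that is left out here on purpose.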

#18 "Yes, roughly linear, though the linear scale factor may vary depending on the local context around the kernel."
How much would you typically capture (average, extreme), and what would you think is typical in general? In your opinion, do the Amdahl parameters indicate an acceptable temporary solution if g++ is required?

Quoting - Raf Schietekat
#18 "Yes, roughly linear, though the linear scale factor may vary depending on the local context around the kernel."
How much would you typically capture (average, extreme), and what would you think is typical in general? In your opinion, do the Amdahl parameters indicate an acceptable temporary solution if g++ is required?

In the conversion case I'm thinking of, there were particular parts of the solver that might have a dozen or more arguments for the class constructor in order to provide the linkages for extracting the kernel. They included arrays and objects that had member functions used in the kernel. I think if I was designing the code with a separate function object in mind, I could do it without so many wiggly ends to reconnect, but in the case of converting an existing serial solver to make use of such a parallel implementation with a subgoal of disrupting the original code as little as possible, the hairy ends were left for all to see. Now, taking Jim's observations about lambda slowdowns as a caution that I hope proves not to be an issue, I'd just use lambda context and dispense with the data linkage management process altogether.

I don't think the Amdahl parameters apply much to the parallel conversion process unless you can dole out the various serial kernels to a coterie of developers to get data conversion scaling ;-). For the sole converter, it would still be a process of taking them on one at a time, your basic n * m, or maybe n * E(i)n where n is the number of kernels to convert and E(i)n is the "embeddedness" or complexity of the kernel in situ, in other words, some measure of the extraction complexity for a particular kernel in question.

"a dozen or more arguments"
Ouch!

"I don't think the Amdahl parameters apply much to the parallel conversion process unless you can dole out the various serial kernels to a coterie of developers to get data conversion scaling ;-)."
The speed-up/slow-down formula doesn't just apply to parallelism, of course. But my real question was whether my suggestion seemed realistic to you. Perhaps not, with a dozen or more arguments required again and again and again?

Quoting - Raf Schietekat
"I don't think the Amdahl parameters apply much to the parallel conversion process unless you can dole out the various serial kernels to a coterie of developers to get data conversion scaling ;-)."
The speed-up/slow-down formula doesn't just apply to parallelism, of course. But my real question was whether my suggestion seemed realistic to you. Perhaps not, with a dozen or more arguments required again and again and again?

No, not again and again and again. The linkages would be established for each kernel with a specialized constructor call that initialized "holders" for the components in the class and the number required varied with the requirements of the kernel. Replication of instances of this class happened behind the scenes as threads scheduled to share the work made copies of the object.

Is it realistic to use explicit functors if you don't have access to lambdas? Yes. I've done it before and I'm likely to do it again as I work with TBB in areas that may not have lambda support yet (like with constraints to use g++ pre-4.5). And the difficulty level of doing that will be determined by the context complexity of each of the kernels.

There was some uncertainty expressed earlier on whether even gcc 4.5 supported lambdas. David Raila reported to me that lambda expressions do indeed work with gcc 4.5. Below are his notes.

[raila@upcrc-win01 llvm]$ gcc-4.5 -v
Using built-in specs.
COLLECT_GCC=gcc-4.5
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --program-suffix=-4.5 --disable-libgcj --enable-languages=c++ : (reconfigured) 
Thread model: posix
gcc version 4.5.0 20100105 (experimental) (GCC) 

Compile with:  gcc-4.5 -std=c++0x


#include <cstdio>

#include "tbb/parallel_for.h"
#include "tbb/blocked_range.h"

using namespace tbb;

template <typename F>
void
doForeach( F f, int start, int end)
{
  for (int i=start; i < end; i++)
    f(i);
}

//
// lambda f
//
auto f = [](int i){printf("Hello3 Lambdas %d\n", i); };

int main() {
        // serial:
        doForeach(f, 0, 10);
        // parallel
        parallel_for(0, 10, f);

        return 0;
}

Yay! Thanks for posting!

The test works with gcc. Use the following command:

$ gcc -c -std=c++0x tstcase.cpp

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.5.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,lto --enable-plugin --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC)
