bug in concurrent_bounded_queue?


Quoting Dmitriy Vyukov

Quoting Andrey Marochko (Intel)
Herlihy writes that "each operation should appear to take effect instantaneously", not that a single store (atomic at hardware level) must be used to implement linearizability.

Yes, but "should *appear* to take effect instantaneously" means that it should be indistinguishable, in any externally observable sense, from an implementation that uses a single atomic store. Which is not the case for algorithms that touch the object's data after the linearization point (TBB queue, M&S queue).

I think you are reading way more into the definition of linearizability than what is there. If what you say is true, then many of the algorithms out there that claim linearizability are not linearizable. Because the exact linearization point itself is *not always externally observable*, you can only guarantee that the point has been passed after the operation responds. Only after that is it safe to call a destructor. (In the example, we know we have passed the linearization point, but that still does not imply "safe to delete container"; it only implies "the data is ready".) This discussion is mixing the very specific and strong property of linearizability (which deals with the behavior of the container -- i.e. when the effects of operations are seen by other operations) with general thread safety. The destructor is not a linearizable operation. Linearizability doesn't apply here.

-Terry

Maybe I missed it from this suddenly very intense discussion, but do we know for sure that a queue can be written without any kind of performance penalty whatsoever that still allows destruction right after receiving the last message? And even so, how important is it to emulate that aspect of a mutex-protected queue?

(Added) Or the other way around: do we know that such a queue can't be written?

Quoting Raf Schietekat
And even so, how important is it to emulate that aspect of a mutex-protected queue?

Personally, I don't think it's very important. I just found the implementation very opaque in this regard.

Mike

Quoting Raf Schietekat
Maybe I missed it from this suddenly very intense discussion, but do we know for sure that a queue can be written without any kind of performance penalty whatsoever that still allows destruction right after receiving the last message?

Perhaps it's possible to modify just the destructor of the tbb queue so that it will wait for outstanding (partially committed) pushes and pops (passively spin until some counters reach the necessary values). That would have basically zero overhead on performance.
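For illustration, a minimal sketch of that idea (the counters and class are hypothetical, not the actual TBB implementation, which could reuse its existing tickets): every operation bumps a "started" counter on entry and a "finished" counter on exit, and the destructor spins until the two agree. It assumes, as in the use case discussed here, that no new operations begin once destruction starts.

#include <atomic>
#include <thread>

class drained_queue {
    std::atomic<long> ops_started;
    std::atomic<long> ops_finished;
public:
    drained_queue() : ops_started(0), ops_finished(0) {}
    void push(int /*x*/) {
        ops_started.fetch_add(1, std::memory_order_relaxed);
        // ... the actual push, including its linearization point ...
        ops_finished.fetch_add(1, std::memory_order_release);
    }
    ~drained_queue() {
        // Passively spin until every operation that entered has fully left,
        // so no thread is still touching this object's memory.
        while (ops_finished.load(std::memory_order_acquire)
               != ops_started.load(std::memory_order_relaxed))
            std::this_thread::yield();
    }
};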

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Raf Schietekat
And even so, how important is it to emulate that aspect of a mutex-protected queue?

If I were porting a large existing application that constantly creates and destroys queues from mutex-based queues to concurrent queues, I would not mind such support.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Dmitriy Vyukov

Quoting Raf Schietekat
Maybe I missed it from this suddenly very intense discussion, but do we know for sure that a queue can be written without any kind of performance penalty whatsoever that still allows destruction right after receiving the last message?

Perhaps it's possible to modify just the destructor of the tbb queue so that it will wait for outstanding (partially committed) pushes and pops (passively spin until some counters reach the necessary values). That would have basically zero overhead on performance.

With any other type of global data, it's the user's responsibility to not delete something that may be in use by another thread. Why is this case different?

I'd like to thank all for a good and useful discussion. At the very least, it showed me that the understanding of linearizability inside the TBB team at Intel appears to be consistent :) Below is what Andrey, Terry, and I said:
"linearizability is about a consistent behavior of the agreed upon set of functions ... These functions, when executed concurrently, should agree with each other about what they do with the object, in a predictable manner. Observations are only done by calling these functions and analyzing what they returned."
"linearizability is a theoretical abstraction <...> One can define the scope of applicability of the theoretical concept to a practical implementation, including both operations set and observable effects."
"linearizability ... deals with the behavior of the container -- i.e. when are the effects of operations seen by other operations"
Of course I do not pretend this is the absolute truth; but knowing that we understand it the same way is great. And all three of us seem to think that safe object destruction is unrelated to the linearizability of other operations on the object.

Trying to identify some actionable points out of the discussion, I think we should
- clarify in the documentation that container destruction is not thread safe, and should only be done after all other operations are known to have completed (and not just observed to take effect);
- analyze whether concurrent_queue's operations that push or pop data are linearizable (I think they are, but I have not yet worked out the proof);
- document whatever behavior guarantees we want to provide for the container.

Quoting Terry Wilmarth (Intel)

With any other type of global data, it's the user's responsibility to not delete something that may be in use by another thread. Why is this case different?

Because if the data structures that you have do not feature such a property (safe deletion), then you won't be able to delete ANY shared objects during program run time.

Consider that you need to delete a concurrent queue object (and it does not feature such a property). You can create a separate helper reference counter to determine when it's safe to delete the queue. You determine when it's safe to delete the queue, and actually delete it. What now? Now you need to delete the helper object. How to do that? You can create a third object to determine when it's safe to delete the second object. Then a fourth, and so on.

More precisely, the number of shared objects will be a non-decreasing function. Even if each of your helper objects occupies just one machine word, you will run out of memory quite fast.

You may say that you will join some threads and then delete all the memory that they used (setting aside for a moment the fact that some programs just do not periodically join threads during run time). A thread is a shared object too. So you still need to determine when it's safe to release all the resources occupied by a thread. If it does not feature the safe deletion property, you need to create one more helper object, and we are back where we started.

Now let's start from the beginning.
Concurrent programs need to delete objects during run time. Period.
So at least some objects HAVE to have the safe deletion property. Period.
I.e. you need to determine when it's safe to delete an object basing your reasoning only on values returned from methods of the object itself (w/o using external objects).

Now that you can delete some objects, we indeed may allow some other objects to not feature the property (they will be a kind of dependent, always requiring some external assistance). But on the other hand it's at least nice if as many objects as possible feature the property, because, well, we need to delete every object at some point in time.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Surely Terry Wilmarth didn't mean that all objects need a helper, with smart pointers having been mentioned quite a few times already. Would you also want to be able to delete the queue after seeing evidence of N items having been pushed if you know that only N items would be coming, or would the prefix in "unsafe_size()" dissuade you in that case? Would you expect try_pop() at least to be successful? If not, why not, and if so, are you sure that there's no penalty involved, because that would seem to require flushing the pipeline before reporting a size? If it's easy to make a size-reporting function that's conservative in the sense of excluding "concurrent modifications in flight", why wouldn't concurrent_queue have one? Just wondering...

(Added) Basically, if deleting a queue based on a popped item is an important property, why not have close() and atend() instead?
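A rough sketch of what such an interface might look like (hypothetical names and semantics, not an actual TBB API; assumes a single producer): the producer calls close() instead of pushing a sentinel item, and a consumer knows the stream is finished when atend() returns true.

#include <atomic>
#include "tbb/concurrent_queue.h"

class closable_queue {
    tbb::concurrent_queue<int> q;
    std::atomic<bool> closed;
public:
    closable_queue() : closed(false) {}
    void push(int x) { q.push(x); }
    void close() { closed.store(true, std::memory_order_release); }
    bool try_pop(int& x) { return q.try_pop(x); }
    bool atend() const {
        // True only once no more items can ever arrive: the producer
        // has closed the queue and everything it pushed was drained.
        return closed.load(std::memory_order_acquire) && q.empty();
    }
};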

I was on vacation and am just catching up on this discussion.

As a practical matter, did anyone check whether concurrent_queue would work in this use case? I think that it might, but have not had time to check. From a quick look, it appears that concurrent_queue::push does not touch *this after the pushed item becomes visible to the consumer.

The difficulty in the implementation of concurrent_bounded_queue::push is that it must signal a waiting consumer thread to wake up, if there is such a waiter. The consumer is signaled after the point at which the item became available to the consumer. Thus the signaling mechanism must not be destroyed prematurely. Ironically, the given use case never has a waiting consumer since it is using try_pop, not pop, and actually does not need the wakeup action.

Quoting Arch Robison (Intel)
As a practical matter, did anyone check whether concurrent_queue would work in this use case? I think that it might, but have not had time to check. From a quick look, it appears that concurrent_queue::push does not touch *this after the pushed item becomes visible to the consumer.

The difficulty in the implementation of concurrent_bounded_queue::push is that it must signal a waiting consumer thread to wake up, if there is such a waiter. The consumer is signaled after the point at which the item became available to the consumer. Thus the signaling mechanism must not be destroyed prematurely.

I came to the same conclusion.

Btw, one of the possible ways to support the use case is to use something along the lines of hazard pointers. I.e. move the actual data into a dynamically allocated object, and prolong its lifetime if necessary.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Arch Robison (Intel)
Thus the signaling mechanism must not be destroyed prematurely. Ironically, the given use case never has a waiting consumer since it is using try_pop, not pop, and actually does not need the wakeup action.

This is an interesting thought; though I'm not sure that's true in this case. It looks like the original example uses pop() where it appears to crash, and the queue with the try_pop() call is never explicitly deleted.

Mike

Quoting Dmitriy Vyukov

Now let's start from the beginning.
Concurrent programs need to delete objects during run time. Period.
So at least some objects HAVE to have the safe deletion property. Period.
I.e. you need to determine when it's safe to delete an object basing your reasoning only on values returned from methods of the object itself (w/o using external objects).

Now that you can delete some objects, we indeed may allow some other objects to not feature the property (they will be a kind of dependent, always requiring some external assistance). But on the other hand it's at least nice if as many objects as possible feature the property, because, well, we need to delete every object at some point in time.

Fortunately, there are kinds of objects that do have the "safe deletion" property - local objects (owned by a single thread at any given time) and static file-scope objects destroyed at program exit. Together with a certain programming discipline, this allows building safe deletion protocols for shared objects.
For example, let's imagine a template class that holds a pointer to the object that one needs to share, also contains a reference counter, and implements smart pointers to access itself. All you need then is the programming discipline. The smart pointers should be local objects never passed by reference, so that they are safe to delete. And the object-to-share should never be used unless the thread holds such a pointer. When the last such pointer is destroyed, and so the reference counter becomes zero, both the object-to-share and the reference counting object can be destroyed.
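A minimal sketch of such a template, purely as an illustration (the names counted_ref and control_block are hypothetical; the counting uses plain C++11 atomics):

#include <atomic>

template<typename T>
class counted_ref {
    struct control_block {
        T* object;
        std::atomic<long> refs;
        explicit control_block(T* o) : object(o), refs(1) {}
    };
    control_block* cb;
public:
    explicit counted_ref(T* obj) : cb(new control_block(obj)) {}
    counted_ref(const counted_ref& other) : cb(other.cb) {
        // A thread may only copy a reference it already holds,
        // so the count can never be observed at zero here.
        cb->refs.fetch_add(1, std::memory_order_relaxed);
    }
    counted_ref& operator=(const counted_ref&) = delete; // keep the sketch small
    ~counted_ref() {
        // The decrement is the last access to the control block: the thread
        // that drops the count to zero knows no other holder remains.
        if (cb->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete cb->object;
            delete cb;
        }
    }
    T* operator->() const { return cb->object; }
};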

Quoting Alexey Kukanov (Intel)

Fortunately, there are kinds of objects that do have the "safe deletion" property - local objects (owned by a single thread at any given time) and static file-scope objects destroyed at program exit. Together with a certain programming discipline, this allows building safe deletion protocols for shared objects.

Nope, it's not.

Destruction of a non-shared object is trivial and irrelevant.

Smart pointers aside (they are no more than syntactic sugar), I am perfectly able to call 'acquire' and 'release' manually.

So what we have is a shared object with 'acquire' and 'release' methods. One of the calls to 'release' returns 'true', which means it's safe to delete the object now. So it's actually a shared object that must support 'safe deletion'. You can't bolt that on with any kind of smart pointers or other external things.

For example, let's imagine a template class that holds a pointer to the object that one needs to share, also contains a reference counter, and implements smart pointers to access itself. All you need then is the programming discipline. The smart pointers should be local objects never passed by reference, so that they are safe to delete. And the object-to-share should never be used unless the thread holds such a pointer. When the last such pointer is destroyed, and so the reference counter becomes zero, both the object-to-share and the reference counting object can be destroyed.

Ok, let's see how a programming discipline will help you. Let's assume that the shared object is a TBB concurrent queue with an initial reference count of 11 (10 items in the queue). We have 11 threads, and each calls try_pop() once. We use a smart pointer (RAII) to ensure that each thread will indeed call try_pop(). When the reference counter drops to zero (try_pop() returns 0), the thread deletes the queue.

Quiz: will smart pointers help you?

See, it has nothing to do with smart pointers and the like. It's the shared object that must be implemented in such a way that *IT* will tell you when it's safe to delete it. If your shared objects do not have such a feature, there is nothing you can do to reduce the number of live objects.
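To make the quiz concrete, here is the pattern in code (a deliberately unsafe sketch; the setup is assumed): the queue itself plays the role of the reference counter, so the deleting thread can race with a try_pop() that is still executing inside the queue.

#include "tbb/concurrent_queue.h"

tbb::concurrent_bounded_queue<int>* q; // pre-filled with 10 items

// Each of the 11 threads runs:
void worker() {
    int item;
    if (!q->try_pop(item))  // the "reference count" reached zero...
        delete q;           // ...but another thread's successful try_pop may
                            // still be touching q's internals, past its
                            // linearization point: undefined behavior.
}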


All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Note that I'm not saying that each and every object must support 'safe deletion', I'm only saying that:
(1) It's a useful property. I don't care how scientists define theoretical linearizability, but I write real-world programs and I need to know when it's safe to delete a particular object. So 'touching the object's data' is an important and observable side effect for me.
(2) Guarantees in this respect must be (at least for a widely used support library) clearly documented. I.e. I need to know whether each particular method touches the object's data after its 'partial linearization point' with respect to each other method. For example, enqueue() may touch data after the linearization point with respect to dequeue(), but not touch data after the linearization point with respect to size() (i.e. the method by means of which I observe the commit of some other method can make a difference). In particular, "no guarantees in this respect" is also clear documentation; it just means that one must always use some external means for that (in particular, sane mutexes, threads, reference counters and safe memory reclamation algorithms are always implemented with "safe deletion" in mind, and thus can be used to safely delete other objects).

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

I concur with both of Dmitriy's points. I suggest that the default be "no guarantees" and we clearly document the cases where it is guaranteed. It would be good to prove that the TBB mutexes have the "safe deletion" guarantee (it's not immediately obvious with some of the complicated mutexes).

Quoting Dmitriy Vyukov

Ok, let's see how a programming discipline will help you. Let's assume that the shared object is a TBB concurrent queue with an initial reference count of 11 (10 items in the queue). We have 11 threads, and each calls try_pop() once. We use a smart pointer (RAII) to ensure that each thread will indeed call try_pop(). When the reference counter drops to zero (try_pop() returns 0), the thread deletes the queue.

Dmitry, I am sorry to say it, but you likely did not get the idea again.

I did not say that counted references are kept as items in the queue, or that there are as many references as items in the queue. I said that each thread, as long as it is going to use the queue, keeps a counted reference all along, and only releases it when done with the queue.
Let's take your earlier example with producer/consumer. The producer pushes the completion signal to the queue, and then removes the reference. The consumer pops the signal, and removes its reference. If you have 11 threads, each should have its own reference before using the queue in any way, and remove it when done. Whoever is last, deletes the queue.
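In code, the discipline might look like this (a sketch only, reusing the hypothetical counted_ref from above; queue_t, msg_t, END_MESSAGE and process are assumed placeholders):

// Producer: holds its own counted reference for as long as it uses the queue.
void producer(counted_ref<queue_t> q) {
    q->push(END_MESSAGE);             // the completion signal, pushed last
}                                     // q releases its reference here

// Consumer: likewise acquires a reference before touching the queue.
void consumer(counted_ref<queue_t> q) {
    msg_t m;
    for (;;) {
        if (!q->try_pop(m)) continue; // nothing yet; poll again
        if (m == END_MESSAGE) break;  // producer is done
        process(m);
    }
}   // whichever thread releases the last reference deletes the queue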

Quoting Dmitriy Vyukov

It's the shared object that must be implemented in such a way that *IT* will tell you when it's safe to delete it. If your shared objects do not have such a feature, there is nothing you can do to reduce the number of live objects.

With the whole memory under the object being reclaimed (including any state that tells about the safety of deletion), no object can say when it's safe to delete it, because it cannot predict the time it will be accessed again, and there always exists a timing window between the check and the destruction when some other thread can try accessing the object. So it is a question of programming discipline at the application level, not of class/object level safety techniques.

I thought more about the components of the solution I suggested above. The template reference counting class is convenient just to separate the concept of counting and apply it to an arbitrary class. Smart pointers are syntactic sugar, with the following key idea underneath: a thread only increments the reference count of a shared object when it already holds a counted reference to that object. Smart pointers just make it more convenient to follow this principle. The thread can then pass the reference to some other thread.

It's like an invitation-only club: if you are a member of that club you can invite other members, but people from the street can't enter; so whoever is the last member leaving the club, destroys it. Indeed the destruction is safe, because the last member knows nobody can be inside :)

And the next thought is that the only difference related to concurrent programming is that reference counting should be done with atomic RMW instructions. Otherwise, it's the same old story of shared ownership.

"From a quick look, it appears that concurrent_queue::push does not touch *this after the pushed item becomes visible to the consumer."
It touches my_rep to notify the consumer. And even if the consumer received the notification before it saw the item, do we know whether the notification mechanism by itself can be "safely" deleted?

"The consumer is signaled after the point that the item became available to the consumer."
What would be the effect, including on performance, of inverting that order, making the consumer busy-wait for the item after receiving the notification, with or without the occasional lost quantum when the producer goes to sleep at exactly the wrong time, if this reordering is at all possible of course, and otherwise why not?

Quoting Alexey Kukanov (Intel)

Dmitry, I am sorry to say it, but you likely did not get the idea again.

I did not say that counted references are kept as items in the queue, or that there are as many references as items in the queue. I said that each thread, as long as it is going to use the queue, keeps a counted reference all along, and only releases it when done with the queue.
Let's take your earlier example with producer/consumer. The producer pushes the completion signal to the queue, and then removes the reference. The consumer pops the signal, and removes its reference. If you have 11 threads, each should have its own reference before using the queue in any way, and remove it when done. Whoever is last, deletes the queue.


I was talking about an implementation of the reference counter itself.

I am able to implement a reference counter by means of a queue so that it works as you described. Or I am able to implement it by means of a queue so that it won't work as you described (the tbb queue as a reference counter).

I am able to implement a reference counter by means of, well, a reference counter, so that it works as you described. Or so that it won't work as you described. Everything depends on the implementation of the shared object. For example, consider:

bool release_reference(object_t* obj)
{
    int rc = obj->rc.fetch_sub(1);
    obj->last_accessed.store(get_timestamp()); // touches the object's data after the decrement
    return rc == 1;
}
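By contrast, a "safe deletion"-friendly release makes the decrement the very last access to the object's memory (the same function, fixed, as a sketch):

bool release_reference(object_t* obj)
{
    // The decrement is the last touch of *obj: the caller that sees the
    // count drop to zero may safely delete the object.
    return obj->rc.fetch_sub(1) == 1;
}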

If your objects support 'safe deletion', then you are able to delete them. If they do not, then you can't. If the reference counter object's methods touch its own data after the linearization point, then your protocol won't help you.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Alexey Kukanov (Intel)

With the whole memory under the object being reclaimed (including any state that tells about the safety of deletion), no object can say when it's safe to delete it, because it cannot predict the time it will be accessed again, and there always exists a timing window between the check and the destruction when some other thread can try accessing the object. So it is a question of programming discipline at the application level, not of class/object level safety techniques.

I see what you mean. But I would describe it as follows.

Accessing a reference counted object only after acquiring a reference is just a matter of satisfying a contract between the component and the outer code. If the contract is not satisfied, then, of course, all bets are off (it's like accessing an array out of bounds). But if it is satisfied, then one does not automatically get everything in this world; one gets only what the component promises. In particular, if the reference counted object does not promise safe deletion (though it's as correct and as linearizable as the tbb queue), you won't get it even if you follow the contract.

And my point is that some components MUST promise safe deletion. And if they do not, sorry, you can't get it with any programming discipline.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

"It's a useful property ... I write real-world programs and I need to know when
it's safe to delete a particular object.
"

I would argue that the original scenario, where queue emptiness is used as a termination signal, is just a corner case. Normally in a concurrent setup a container being empty guarantees nothing, as there can easily be another bunch of push/pop pairs in flight in other threads. Besides, most real-world applications operate with multiple objects, and making one of them moonlight as an execution flow controller is unsound from a design stability standpoint.

Thus even though it is indeed a useful property in general, its applicability area is narrow enough to sacrifice it without any qualms in favor of better performance and scalability of the primary container operations (insertion, deletion, lookup, etc.).

Quoting Alexey Kukanov (Intel)

It's like an invitation-only club: if you are a member of that club you can invite other members, but people from the street can't enter; so whoever is the last member leaving the club, destroys it. Indeed the destruction is safe, because the last member knows nobody can be inside :)

It seems that you are talking about reference counters with only basic thread safety. Strongly thread safe reference counters do allow people from the street to enter.
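To illustrate the difference (a sketch assuming C++20's std::atomic<std::shared_ptr>, which provides exactly this strong thread safety and postdates this discussion): with strong thread safety, a thread may acquire its very first reference directly from a shared location, concurrently with other threads replacing it.

#include <atomic>
#include <memory>

std::atomic<std::shared_ptr<int>> g_obj(std::make_shared<int>(42));

// A thread "from the street": it holds no reference yet, but may still
// safely acquire one straight from the shared location.
void street_thread() {
    std::shared_ptr<int> local = g_obj.load();
    if (local) { /* use *local */ }
}
// With only basic thread safety (plain shared_ptr copies), this load would
// race with another thread concurrently doing g_obj.store(nullptr).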

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Andrey Marochko (Intel)
I would argue that the original scenario, where queue emptiness is used as a termination signal, is just a corner case. Normally in a concurrent setup a container being empty guarantees nothing, as there can easily be another bunch of push/pop pairs in flight in other threads. Besides, most real-world applications operate with multiple objects, and making one of them moonlight as an execution flow controller is unsound from a design stability standpoint.

Your opinion disagrees with the opinion of the designers of the POSIX standard. Why would they provide a socket shutdown function then? It's basically the same - the producer may send an END message, after receiving which the consumer may close the socket, because it knows that nobody uses it anymore.
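The analogy in code (a minimal sketch with error handling omitted; fd is an already-connected socket): the producer's half-close plays the role of the END message, and the consumer releases the socket once it has seen it.

#include <sys/socket.h>
#include <unistd.h>

void producer_done(int fd) {
    shutdown(fd, SHUT_WR);            // the "END message": no more data will follow
}

void consume_all(int fd) {
    char buf[256];
    while (recv(fd, buf, sizeof buf, 0) > 0)
        ;                             // process the received data
    close(fd);                        // recv() returned 0: the producer is done,
                                      // so nobody uses the socket anymore
}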

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Andrey Marochko (Intel)

Thus even though it is indeed a useful property in general, its applicability area is narrow enough to sacrifice it without any qualms in favor of better performance and scalability of the primary container operations (insertion, deletion, lookup, etc.).

And... what performance/scalability of insert/remove would you have to sacrifice in order to implement it?

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Dmitriy Vyukov

Quoting Andrey Marochko (Intel)

Thus even though it is indeed a useful property in general, its applicability area is narrow enough to sacrifice it without any qualms in favor of better performance and scalability of the primary container operations (insertion, deletion, lookup, etc.).

And... what performance/scalability of insert/remove would you have to sacrifice in order to implement it?

There is also implementation complexity, of course. However, that's a different matter. If you decide not to penalize performance, each and every end user benefits. If you decide not to increase complexity, you personally benefit :)

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

Quoting Dmitriy Vyukov

I see what you mean. But I would describe it as follows.

Accessing a reference counted object only after acquiring a reference is just a matter of satisfying a contract between the component and the outer code. If the contract is not satisfied, then, of course, all bets are off (it's like accessing an array out of bounds). But if it is satisfied, then one does not automatically get everything in this world; one gets only what the component promises. In particular, if the reference counted object does not promise safe deletion (though it's as correct and as linearizable as the tbb queue), you won't get it even if you follow the contract.

And my point is that some components MUST promise safe deletion. And if they do not, sorry, you can't get it with any programming discipline.

Ok, I think we are in consensus, or at least close enough (finally).

I agree that for the implementation of reference counters (and generally objects that control the lifetime of themselves or other objects), a stricter guarantee than linearizability is required. Fortunately, the very straightforward implementation of reference counters with atomic RMW instructions on a single machine word satisfies the required property. That's what I always had in mind, and so getting to the point of mutual understanding took longer. Anyway, I'm glad that happened :)

As for the "invitation club" scheme, indeed this is just one possible way to design the high-level discipline (or contract if you wish); there are hazard pointers and other schemes as well, and garbage collectors, after all. I just wanted to show that there is a simple and working solution to the problem, but it is not 100% technical. And generally the solution cannot be 100% done at the library side. Well, maybe unless it's a library for a garbage collected environment, with corresponding support from both the language and the runtime.

"From a quick look, it appears that concurrent_queue::push does not touch *this after the pushed item becomes visible to the consumer."

"It touches my_rep to notify the consumer. And even if the consumer received the notification before it saw the item, do we know whether the notification mechanism by itself can be "safely" deleted?"

I'm fairly certain that concurrent_queue::push does not touch my_rep after the item becomes visible to the consumer. A pushed item does not become visible until method push executes the atomic fetch-and-add "tail_counter+=..." inside the representation, after which it does not touch the representation. That fetch-and-add is the notification mechanism. So it is safe for the consumer to delete it.

As an informal check, I modified the original example to use concurrent_queue instead of concurrent_bounded_queue. It's been running for 100,000,000 iterations without complaint. Unfortunately, I had to replace the "pop" in the original with a busy wait loop around "try_pop", so it lacks some of the desirable behavior of the original example.
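The replacement presumably looked something like this (my reconstruction, not the actual modified example; the names q and item are assumed):

#include "tbb/concurrent_queue.h"
#include "tbb/tbb_thread.h"

tbb::concurrent_queue<int> q;

int consume_one() {
    int item;
    while (!q.try_pop(item))            // spin until an item arrives
        tbb::this_tbb_thread::yield();  // give other threads a chance
    return item;
}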

What makes concurrent_bounded_queue different, and thus makes it break in the same scenario, is that it has an additional wakeup mechanism after the update of tail_counter.

"Your opinion disagress with opinion of designers of POSIX standard.
Why would they provide socket shutdown function then? It's basically the
same - producer may send an END message, after receiving of which
consumer may close the socket, because he knows that nobody uses it
anymore.
"
Well, design approaches to shared memory and distributed applications are a little different, aren't they? :)

Thanks, Arch, that was very edifying.

Mike

Quoting Dmitriy Vyukov

There is also implementation complexity, of course. However, that's a different matter. If you decide not to penalize performance, each and every end user benefits. If you decide not to increase complexity, you personally benefit :)


Unfortunately, excessive implementation complexity often results in a solution completely unsuitable for real-world needs. For example, there is a slew of lock-free and even wait-free algorithms out there. But how many of them are used in practice?

#77 "I'm fairly certain that concurrent_queue::push does not touch my_rep after the item becomes visible to the consumer. A pushed item does not become visible until method push executes the atomic fetch-and-add "tail_counter+=..." inside the representation, after which it does not touch the representation. That fetch-and-add is the notification mechanism. So it is safe for the consumer to delete it. "
Here's a piece of code from concurrent_queue_base_v3::internal_push() at concurrent_queue.cpp:385-386 in tbb30_20100406oss:

r.choose( k ).push( src, k, *this );
r.items_avail.notify( predicate_leq(k) );

First push(), which makes the item visible (right?), then use of r, which is a reference to *my_rep. That means that seeing a "last" item is not enough to "safely" delete the queue, if that is the goal.

That's not the concurrent_queue. concurrent_queue does not do any notifications.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

"That's not the concurrent_queue. concurrent_queue does not do any notifications."
Am I hallucinating, then? Or is there a less disturbing explanation? :-)

I guess you are looking at concurrent_bounded_queue.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

"I guess you are looking at concurrent_bounded_queue."
concurrent_queue.h:408: "using strict_ppl::concurrent_queue;"
concurrent_queue.h:43: "class concurrent_queue: public internal::concurrent_queue_base_v3 {"
concurrent_queue.h:166: "class concurrent_bounded_queue: public internal::concurrent_queue_base_v3"
So both non-bounded and bounded derive from internal::concurrent_queue_base_v3, which has the problematic code that I quoted earlier.

There are 2 different concurrent_queue_base_v3 classes. One of them has the following implementation:

void concurrent_queue_base_v3::internal_push( const void* src ) {
    concurrent_queue_rep& r = *my_rep;
    ticket k = r.tail_counter++;
    r.choose(k).push( src, k, *this ); // no notification after the push
}

All about lock-free algorithms, multicore, scalability, parallel computing and related topics: http://www.1024cores.net

"There is 2 different concurrent_queue_base_v3."
I have no words...

They are defined in different namespaces: tbb::internal and tbb::strict_ppl::internal. :)
