bug in concurrent_bounded_queue?

Hi there,

The snippet below will crash after a while with an access violation somewhere below tbb_debug.dll!concurrent_queue_base_v3::internal_push() when running 1 tbb thread for consume() and 10 for produce(). This is reproducible on my Vista 64 Box (snippet is compiled to 32 bits/debug). I am using Studio 2008 SP1. Any ideas?

struct QueuedCall
{
    void unblock() { myQueue.push(1); }

    void block() {
        int dummy;
        myQueue.pop(dummy);
    }

    tbb::concurrent_bounded_queue<int> myQueue;
};

tbb::concurrent_queue<QueuedCall*> myPendingCalls;

void consume() {
    do {
        QueuedCall *call;
        if (myPendingCalls.try_pop(call))
            call->unblock();
    } while (true);
}

void produce() {
    for (;;) {
        QueuedCall *c = new QueuedCall();
        myPendingCalls.push(c);
        c->block();
        delete c;
    }
}


Quoting pp0

The snippet below will crash after a while with an access violation somewhere below tbb_debug.dll!concurrent_queue_base_v3::internal_push() when running 1 tbb thread for consume() and 10 for produce(). This is reproducible on my Vista 64 Box (snippet is compiled to 32 bits/debug). I am using Studio 2008 SP1. Any ideas?


That's not surprising since you delete an item straight after enqueue, so the queue basically transfers dangling pointers between threads.

All about lock-free algorithms, multicore, scalability, parallel computing and related topics:
http://www.1024cores.net

How so? I am synchronizing by blocking on the producer and unblocking on the consumer.

Ah, I see. I missed that.

Then perhaps the problem is that an item (and hence the myQueue queue) is deleted between the moment an item becomes available for consuming and the return from push(). You must guarantee myQueue's lifetime until push() returns.

Consider a schematic push() implementation:

... push(...)
{
    ...
    make the item available for consuming
    ...
    // at this point the consumer pops the item and deletes the queue (this)
    ...
    some more operations on this' members, which cause a page fault
    because 'this' is already deleted
}


A naive mutex-based queue will not have such a limitation, because the actual enqueue and the return from push() are atomic. That may not be true for a non-naive and/or non-mutex-based queue.
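For concreteness, here is a minimal sketch (illustrative names, not TBB code) of such a naive mutex-based queue: the lock is held for the whole body of push(), so the item becomes visible to consumers only as push() returns.

```cpp
#include <deque>
#include <mutex>

// Minimal sketch of a "naive" mutex-based queue: every operation holds the
// lock until it returns, so "item becomes visible" and "push() returns"
// collapse into essentially one atomic step.
template <typename T>
class naive_mutex_queue {
public:
    void push(const T& item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(item);
        // The lock is released only as push() returns, so no consumer can
        // observe the item while this thread is still working inside push().
    }

    bool try_pop(T& item) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty())
            return false;
        item = items_.front();
        items_.pop_front();
        return true;
    }

private:
    std::deque<T> items_;
    std::mutex mutex_;
};
```

The trade-off, of course, is that every operation serializes on the one mutex.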


Quoting Dmitriy Vyukov
Then perhaps the problem is that an item (hence the myQueue queue) is deleted between an item become available for consuming and return from push(). You must guarantee the myQueue's life-time till return from push().

If that's true, the concurrent_queue is completely pointless for any heap pointer, since a consumer could never be guaranteed to be able to call delete w/o some other form of synchronization with the producer.

Mike

That is a great suggestion. It would be a weird implementation though. A cursory review of the code shows the last step in push() to be:

r.items_avail.notify( predicate_leq(k) );

Which seems to indicate that the push would not be visible until it actually terminates. The code is not particularly easy to read though, so I could be wrong.

-pp

> If that's true, the concurrent_queue is completely pointless for any
> heap pointer since a consumer could never be guaranteed to be able to
> call delete w/o some other form of synchronization with the producer.

Why did you draw that conclusion? You can indeed pass pointers and delete them on the consumer side.
However, you must not delete the queue itself while there are threads still executing its methods. That's a basic safety rule equally applicable to all types out there.


Quoting Dmitriy Vyukov
Why you made such conclusion? You can indeed pass pointers and delete them on consumer side.
However, you must not delete the queue itself while there are threads still executing it's methods. That's the basic safety rule equally applicable to all types out there.
You're right, I misunderstood what you were saying. I guess I expected (perhaps wrongly so) a bit more protection from threading issues in such a concurrency-oriented product. The fact that all push/pop signals have been set signifies to me that it should be safe to call the destructor.

Mike

Quoting mwhenson
You're right, I misunderstood what you were saying. I guess I expected (perhaps wrongly so) a bit more protection from thread issues in such a concurrency oriented product. All push / pop signals being set signifies to me that it should be safe to call the
destructor.

I still can't make up my mind on whether queues must support such usage patterns or not. But I am inclined toward the answer: no, they do not have to.

There are a lot of "strange" usage patterns a user can think up. For example, consider:

queue* q = new queue;
q->enqueue(1);

// thread 1
if (false == q->dequeue())
    delete q;

// thread 2
if (false == q->dequeue())
    delete q;

So enqueue() must be strictly linearizable too to support such a pattern.

Then, it seems that for each "strange" use pattern there is a "normal" implementation that achieves the same.

Then, support for such "strange" use patterns can penalize performance for "normal" use patterns.

So I guess it's OK to just prohibit such "strange" use patterns.


Hmm... I've found an example which is both reasonable/realistic and not easy to rewrite correctly:

struct tcp_connection
{
  queue_t q;
  ...
};

void connection_thread(tcp_connection* c)
{
  for (;;)
  {
    msg_t* m = c->dequeue();
    if (m->type == MSG_DISCONNECT)
    {
      delete c;
      return;
    }
    process(m);
  }
}

void io_dispatch_thread()
{
  for (;;)
  {
    select(...);
    for_each(SOCKET s in read_set)
    {
      recv(s, buf, size);
      tcp_connection* c = find_connection(s);
      c->enqueue(msg_t(MSG_PACKET, data, size));
    }
    for_each(SOCKET s in exception_set)
    {
      tcp_connection* c = find_connection(s);
      c->enqueue(msg_t(MSG_DISCONNECT));
    }
  }
}
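One conceivable fix here is reference counting. A single-threaded sketch of the mechanism (the `connection` type below is a hypothetical stand-in, not code from the example): if both the dispatcher and the connection thread hold a shared_ptr, the object can only die after the last holder releases it, i.e. only after the dispatcher has fully left enqueue().

```cpp
#include <memory>

// Hypothetical stand-in for tcp_connection: records when it is destroyed,
// so we can check that destruction waits for the last reference.
struct connection {
    explicit connection(bool* destroyed_flag) : destroyed_flag_(destroyed_flag) {}
    ~connection() { *destroyed_flag_ = true; }  // record destruction
    bool* destroyed_flag_;
};

// Replays the teardown order from the example above, single-threadedly:
// the consumer drops its reference on MSG_DISCONNECT, the dispatcher
// drops its own only when it returns from enqueue().
inline bool destruction_waits_for_last_reference() {
    bool destroyed = false;
    auto dispatcher_ref = std::make_shared<connection>(&destroyed);
    auto consumer_ref = dispatcher_ref;  // connection thread's reference

    consumer_ref.reset();  // consumer "deletes" after MSG_DISCONNECT
    bool alive_while_producer_inside = !destroyed;  // dispatcher may still be inside enqueue()

    dispatcher_ref.reset();  // dispatcher releases its reference on return
    return alive_while_producer_inside && destroyed;
}
```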



I am curious as to what TBB team thinks on this.


Are not smart pointers the usual solution for this type of problem?

Quoting Alexey Kukanov (Intel)

Are not smart pointers the usual solution for this type of problems?

What type of problem?

The problem is that while one uses a mutex-based queue he has no problems, but once he switches to TBB he starts getting chaotic memory corruption and crashes.


Quoting Dmitriy Vyukov
Humm... I've found an example which is both reasonable/realistic and not easy way to rewrite correctly:

Correct me if I'm wrong, but the difference between this and the original example is that this would also be wrong with a mutex-based queue.

Mike

Quoting mwhenson

Quoting Dmitriy Vyukov
Humm... I've found an example which is both reasonable/realistic and not easy way to rewrite correctly:

Correct me if I'm wrong, but the difference between this and the original example is that this would also be wrong w a mutex based queue.

Please, elaborate. I do not see why my example won't work with a mutex-based queue.


Quoting Dmitriy Vyukov

Please, elaborate. I do not see why my example won't work with a mutex-based queue.

Maybe I didn't see the correct problem then; can you explain how your example could fail?

Mike

The dispatch thread posts the disconnect message:

c->enqueue(msg_t(MSG_DISCONNECT));

but has not yet returned from enqueue().

The connection thread receives the message:

msg_t* m = c->dequeue();
if (m->type == MSG_DISCONNECT)

and deletes the queue:

delete c;

Now the dispatch thread crashes in enqueue().


And now, how would it still fail even with a smart pointer (with both producer and consumer holding a reference to the queue)? :-)

I guess it won't. But it won't make the queue linearizable either.


Quoting Raf Schietekat
And now how it would still fail even with a smart pointer (with both producer and consumer holding a reference to the queue)? :-)

It shouldn't, but that ignores the point that many people, especially those with experience with a mutex-based queue, will probably find this behavior surprising and unintuitive at first.

Mike

Quoting mwhenson

Quoting Raf Schietekat
And now how it would still fail even with a smart pointer (with both producer and consumer holding a reference to the queue)? :-)

It shouldn't, but that ignores the point that many people, especially to those with experience with a mutex based queue, will probably find this behavior initially surprising and not initially intuitive.


The queue is not linearizable:

http://en.wikipedia.org/wiki/Linearizability

Linearization points

This definition of linearizability is equivalent to the following:

  • All function calls have a linearization point at some instant between their invocation and their response
  • All functions appear to occur instantly at their linearization point, behaving as specified by the sequential definition

This alternative is usually much easier to prove. It is also much easier to reason about as a user, largely due to its intuitiveness. This property of occurring instantaneously, or indivisibly, leads to the use of the term atomic as an alternative to the longer "linearizable".


It's fine by me if it's not linearizable, provided the documentation mentioned its non-linearizability, which it doesn't, or at least not that I can find. Because if we don't know this, as Wikipedia mentions:

On the other hand, if an operation is not atomic, the user will also
need to understand and cope with sporadic extraneous behaviour caused by
concurrent operations, which by its nature is likely to be hard to
reproduce and debug.

Mike

Also, thanks for your help.

Mike

"I guess it won't. But it won't make the queue linearizable either."
In what sense?

I think it's somewhat careless to add yet another overload to what "atomic" is supposed to mean. It seems a bit like sequential consistency, both in what it does and in what it means to want to be able to take it for granted: very appealing at first sight, but would you want to pay the probably unavoidable cost? Nonlinearisability just comes with the territory, I'd say.

I read more info on linearizability, thought more about it, and discussed it in the team. My vision on the issue is:
- Linearizability is a property of a set of functions (invocations and responses), in other words a property of a synchronization protocol used to coordinate these functions.
- A linearizable function is not obliged to have a single linearization point; it can have distinct such points for distinct other functions from the set.
- Some methods of a concurrently accessed object might be linearizable, some might not.
- TBB concurrent_queue meets the documented promise: the push() and pop() methods are linearizable, while destruction is not thread-safe at all.
- I do not see any practical way other than reference counting (smart pointers) to make object destruction thread-safe and possibly linearizable. In particular, destruction of a lock-protected queue is not linearizable either.

To prove the last item, how about the following simple history for a queue that has methods push(), pop(), and destroy(), all protected with the same lock?
thread 1 calls queue.destroy()
thread 2 calls queue.push()
thread 1 returns from queue.destroy()
thread 2 returns from queue.push()... or maybe crashes.
It is easy to see that each thread's view of the history is valid; thread 2 did not call destroy(), and it could not know some other thread did, because at the moment it called push() the queue was alive.

What I wanted to show is that a special synchronization protocol is required to make destruction thread-safe. Moreover, the protocol Dmitry assumed (the producer pushes a special signal, the consumer destroys the queue after receiving the signal) is insufficient in general, because it does not work if there are several producers or consumers. And TBB concurrent_queue is a multi-producer multi-consumer queue.

The bottom line is that I do not see the described issue as a lack of linearizability, or as inconsistency with the documentation. The solution, as I already suggested, is to use smart pointers.

Quoting Alexey Kukanov (Intel)
- A linearizable function is not obliged to have a single linearization point; it can have distinct such points for distinct other functions from the set.

Nope.

Every function must have a single linearization point. Linearizability is observable sequential consistency. If a function had several linearization points, then the following scenario would be possible.

Thread 0 partially executes function A.

Thread 1 observes an effect of the invocation of function A by means of function B.

Thread 1 communicates to thread 2 that A has completed.

Thread 2 receives the message from thread 1.

Thread 2 tries to verify that function A has indeed completed by means of function C, but fails.

For example, consider that queue::enqueue() is implemented as: (1) increment the queue size, (2) enqueue the message. Then:

Thread 0 partially executes queue::enqueue() (i.e. the size is incremented, but the message is not yet enqueued).
Thread 1 executes queue::size(), and obtains the value 1.
Thread 1 communicates to thread 2 that the queue is not empty.
Thread 2 executes queue::dequeue(), but obtains a zero pointer (thread 0 still has not completed enqueue()).
Thread 2 dereferences the zero pointer. Bang!

Such a scenario is impossible with a linearizable queue.
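The scenario above can be replayed single-threadedly with a sketch of such a broken queue (illustrative only, not TBB's implementation); enqueue() is split into two phases purely to expose the window between them:

```cpp
#include <atomic>

// Sketch of the broken ordering described above: size is incremented before
// the item is published, so size() can claim an item that dequeue() cannot
// yet obtain. The two phases stand for the two halves of one enqueue() call.
struct broken_queue {
    std::atomic<int> size{0};
    std::atomic<int*> item{nullptr};

    void enqueue_phase1() { size.fetch_add(1); }    // (1) increment queue size
    void enqueue_phase2(int* p) { item.store(p); }  // (2) publish the item

    int* dequeue() { return item.exchange(nullptr); }  // nullptr if unpublished
};
```

Calling size() between the two phases reports 1 while dequeue() still yields a null pointer, which is exactly the inconsistency the three-thread scenario exposes.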


However, it may still be useful to think in terms of "partial linearization points" (when a function has several linearization points) in some contexts. But that does not make a data structure linearizable as a whole.


Quoting Alexey Kukanov (Intel)
- I do not see other practical way than reference counting (smart pointers) to make object destruction thread safe and possibly linearizable. In particular, destruction of a lock-protected queue is not linearizable either.

Alexey, you are missing the point.

Destruction is not linearizable, and should not be linearizable; it's a strictly single-threaded operation.

BUT one may use the linearizability of operations on a data structure to determine a point at which it's safe to destroy it (we are not talking about the contents of the destructor, only about the point at which it's safe to destroy).

Consider:

A consumer thread receives the last message from a producer thread. Now it "thinks": "OK, it's the last message, so the producer does not use the queue any more; now I may delete the queue". And a linearizable queue (mutex-based) permits such a scenario, while the TBB queue (as well as the M&S queue) does not.


Quoting Alexey Kukanov (Intel)
To prove the last item, how about the following simple history for a queue that has methods push(), pop(), and destroy(), all protected with the same lock?
thread 1 calls queue.destroy()
thread 2 calls queue.push()
thread 1 returns from queue.destroy()
thread 2 returns from queue.push()... or maybe crashes.
Easy to see that each thread's view of the history is valid; thread 2 did not call destroy() and it could not know some other thread did, because at the moment it called push() the queue was alive.

This has nothing to do with the topic; it's a plain programming error. A thread should not destroy objects while they are in use; it must first determine a point at which it's safe to destroy the object.


Quoting Alexey Kukanov (Intel)
What I wanted to show is that a special synchronization protocol is required to make destruction thread-safe.

Indeed.

What you missed is that the protocol can include return values from methods of the object (dequeue returned LAST_MESSAGE -> safe to destroy).

Quoting Alexey Kukanov (Intel)
Moreover, the protocol Dmitry assumed
(the producer pushes a special signal, the consumer destroys the queue
after receiving the signal) is insufficient in general, because it does
not work if there are several producers or consumers. And TBB
concurrent_queue is multi-producer multi-consumer queue.

Nope. TBB concurrent_queue IS-A single-producer/single-consumer queue as long as you do not provide a tbb::single_producer_single_consumer_concurrent_queue.
Or do you want to say that the documentation/intention prohibits usage of tbb::concurrent_queue as an SPSC queue? Do you?


Quoting Dmitriy Vyukov
Consider:

A consumer thread receives a last message from a producer thread. Now it "thinks": "ok, it's a last message, so producer does not use the queue any more, now I may delete the queue". And linearizable queue (mutex-based) permits such scenario, while TBB queue (as well as M&S queue) does not.

I'm not sure I get this. Assume a queue is linearizable -- that its push and pop operations each have a single linearization point. These points govern what is visible by one operation, with respect to another. They occur at some point between the invocation and response of the operation. So a linearization point on the producer inserting the "last" element has passed, but the push operation may not have terminated yet. The pop operation "sees" the pushed element, takes it out, passes its linearization point, responds. Then the destructor is called. But nothing about those linearization points tells you that the push operation has completed and that it is safe to destroy the queue...? Or am I missing something?

Thanks!

Terry

The bottom line is that I do not see the described issue as lack of linearizability, or as inconsistency with documentation. The solution for it, as I already suggested, is to use smart pointers.

I respectfully disagree. In the 1st post example, the consumer knows the producer has stopped producing, knows he's the only consumer, knows he's returned from the only pop call, then tried deleting. Had there been a lock around push / pop / destroy calls, the call to pop would not have returned until it was safe to delete in that example.

I'm not an expert on the definition of linearization, but this seems inconsistent with the definition presented here and in wikipedia. It's definitely inconsistent with a naive implementation with a single lock around everything. When we first ran into this problem, my first action was to check the documentation, but found no clues. I reiterate that this behavior is surprising to someone used to dealing with more naive implementations of a concurrent queue and suggest a note in the documentation if appealing to a less expert audience is a goal. I agree that a smart pointer should solve the problem, but that's not too useful when you have no reason to expect to need another layer of synchronization.

Mike

Indeed your example is not linearizable in any sense, including my understanding. Such an implementation would just be buggy.

In my understanding, the enqueue() and dequeue() implementations should also synchronize internally, e.g. via a readiness flag for the enqueued element. In this case, the atomic increment of size would serve as the linearization point for size(), and setting the flag would serve as the linearization point for dequeue().

The way to observe an object is to call some method of it. I say that making every method adhere to linearizability rules is very costly in general, and so is often impractical. Let's consider the example with the lock-based queue again. Let the producer thread push the completion signal, unlock the queue, and right after that be preempted while still inside the push() method. Let the consumer thread receive the signal, destroy the queue, _and_ unload the module that implements the queue - that seems to be safe because nobody is using the queue anymore, doesn't it? And then the producer thread resumes, and finds itself executing the epilogue of a function which no longer exists. Oops.

I will not comment on linearizability of the concurrent_queue, and I even take back my previous comment about push() and pop() being linearizable with respect to each other, because:
1) for the moment, we did not promise it in the documentation;
2) I did not prove it myself, nor do I know of anyone who did.
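A sketch of the readiness-flag idea from the first paragraph, reduced to a single slot (illustrative only; this is not how TBB's queue is implemented): size() may already report the element, but a blocking dequeue() spins until the element is published, so it never returns a half-constructed item.

```cpp
#include <atomic>

// Readiness-flag sketch for one slot: size is still incremented first, but
// dequeue() blocks until the element is marked ready, so it never observes
// a half-published item.
struct ready_slot {
    std::atomic<int> size{0};
    std::atomic<bool> ready{false};
    int value = 0;

    void enqueue(int v) {
        size.fetch_add(1);                             // linearization point for size()
        value = v;                                     // construct the element
        ready.store(true, std::memory_order_release);  // linearization point for dequeue()
    }

    int dequeue() {
        // Blocking dequeue: spin until the element has been published.
        while (!ready.load(std::memory_order_acquire)) { }
        return value;
    }
};
```

The release/acquire pair makes the written value visible to the dequeuing thread once the flag is observed; a wait-free dequeue() could not provide this, which matches the point made below about blocking dequeue.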

Quoting mwhenson

I respectfully disagree. In the 1st post example, the consumer knows the producer has stopped producing, knows he's the only consumer, knows he's returned from the only pop call, then tried deleting. Had there been a lock around push / pop / destroy calls, the call to pop would not have returned until it was safe to delete in that example.

I'm not an expert on the definition of linearization, but this seems inconsistent with the definition presented here and in wikipedia. It's definitely inconsistent with a naive implementation with a single lock around everything. When we first ran into this problem, my first action was to check the documentation, but found no clues. I reiterate that this behavior is surprising to someone used to dealing with more naive implementations of a concurrent queue and suggest a note in the documentation if appealing to a less expert audience is a goal. I agree that a smart pointer should solve the problem, but that's not too useful when you have no reason to expect to need another layer of synchronization.

Mike

What the consumer does not know, however, is whether the producer has really returned from the enqueue call. And linearizability does not provide such a promise. All it promises is that the operation effects important for the consumer are observed by the consumer at some instant point, no matter in what sequence and at what time the calls were made. See my previous post - even with the lock-based queue, no guarantee can be made that the producer has really returned from the call! Just that it has done its work.

Yes, a note in the documentation might be reasonable. The note should say that making destruction of a concurrent object thread-safe by means of the object itself is impossible in principle, and that additional means/protocols, such as smart pointers, should be used at the application level.

Quoting Terry Wilmarth (Intel)

I'm not sure I get this. Assume a queue is linearizable -- that its push and pop operations each have a single linearization point. These points govern what is visible by one operation, with respect to another. They occur at some point between the invocation and response of the operation. So a linearization point on the producer inserting the "last" element has passed, but the push operation may not have terminated yet. The pop operation "sees" the pushed element, takes it out, passes its linearization point, responds. Then the destructor is called. But nothing about those linearization points tells you that the push operation has completed and that it is safe to destroy the queue...? Or am I missing something?

Thanks!

Terry

It seems to me that Dmitry for some reason assumes that after the linearization point has passed, the method cannot even touch (much less change) any state of the queue. This is obviously impractical, and it does not follow from the definition of linearizability.

We have already discussed in the team that any attempt to make destruction of a concurrent object safe from inside the object is meaningless, because the container itself cannot guarantee that no other thread will touch it the very next moment. For that purpose there should be additional protocols applied, which a generic container obviously cannot know about.

Dmitry suggested such a protocol that he thinks is suitable for single-producer single-consumer queues, based on the observation that it works with the lock-based queue. However, let's apply formal logic. We have two postulates:
- a lock-based queue is linearizable;
- a lock-based queue supports the described protocol for safe destruction.
From these postulates, does it follow that any linearizable SPSC queue supports the described protocol? No, it does not.

Quoting Terry Wilmarth (Intel)

I'm not sure I get this. Assume a queue is linearizable -- that its push and pop operations each have a single linearization point. These points govern what is visible by one operation, with respect to another. They occur at some point between the invocation and response of the operation. So a linearization point on the producer inserting the "last" element has passed, but the push operation may not have terminated yet. The pop operation "sees" the pushed element, takes it out, passes its linearization point, responds. Then the destructor is called. But nothing about those linearization points tells you that the push operation has completed and that it is safe to destroy the queue...? Or am I missing something?

My reasoning is as follows.

For a linearizable data structure, the observable side effects of an operation either (1) do not take place at all or (2) all take place together. There are no partial states.

One of the observable side effects is "return from the function" (by return I mean that the thread has "stopped touching the object" rather than a physical return, i.e. execution of a RET instruction).

So if a consumer has observed that the message is in the queue, then the producer has stopped touching the object (provided that the object is linearizable).

And if the producer did not stop touching the object, then the consumer may observe a partial state (by unmapping the memory and catching SIGSEGV). Thus the data structure is not linearizable.


Quoting Alexey Kukanov (Intel)
In my understanding, enqueue() and dequeue() implementations should also synchronize internally, e.g. viaa readiness flag for the enqueued element. In this case, atomicincrement of size would serve as linearization point for size(), and setting the flag would serve as linearization point for dequeue().

Sequential consistency (linearizability) does not have any sense for a single thread (the order is just what is observed). SC only has meaning when we consider a plurality of observers (their partially observed orders must form a consistent total order).

So what you described is not sequentially consistent (linearizable), because if the threads that execute size() and dequeue() communicate with each other (or with a third observer), they will reveal the inconsistency. Namely, size() returns 1 (i.e. the enqueue has taken place) and AFTER that dequeue() returns 0 (i.e. the enqueue has not taken place).


Quoting Dmitriy Vyukov

Sequential consistency (linearizability) does not have any sense for a single thread (the order is just what is observed). SC only have meaning when we consider a plularity of observers (their partial observed orders must form a consistent total order).

So what you described is not sequentially consistent (linearizable), because if threads that execute size() and dequeue() communicate with each other (or with a third observer), they will reveal the inconsistency. Namely, size() returns 1 (i.e. enqueue has taken place) and AFTER that dequeue() return 0 (i.e. enqueue has not taken place).

This time it seems you did not get my point :)
In what I described, dequeue() would not return at all until it sees the object is fully constructed and can be obtained. It would be impossible to make a wait-free dequeue call linearizable, but it is possible with a blocking dequeue().

Thanks for the clarification everyone, I found it useful.

Mike

Quoting Alexey Kukanov (Intel)

It seems tome that Dmitry for some reason assumes that after the linearization point has passed the method cannot even touch (and much less change) any state of the queue. This is obviously impractical, and this does not follow from the definition of linearizability.

It does follow. I can observe that the operation is partially committed: the message is available for consuming, but the thread has not stopped touching the object. "Touching the object" is an observable side effect because, well, I am able to observe it.


Quoting Alexey Kukanov (Intel)

We have already discussed in the team that any attempt to make destruction of the concurrent object safe from inside the object is meaningless, because the container itself can not guarantee that no other thread would touch it the very next moment. For that purpose there should be additional protocols appliedwhich a generic container obviously can not know.

So is the following implementation of reference counting OK with you?

bool release_object(object_t* obj)
{
    int rc = obj->rc.fetch_sub(1, memory_order_acq_rel);
    obj->last_access.store(get_timestamp(), memory_order_relaxed);
    return rc == 1;
}


Quoting Dmitriy Vyukov

It does follow. I can observe that the operation is partially committed: the message is available for consuming, but the thread is not stopped touching the object. "Touching the object" is observable side effect because, well, I am able to observe it.

By unmapping the memory page which the object resides in? Oh well :) This way you can observe almost everything, e.g. changes of automatic variables on the stack of another thread.

To me, linearizability is about a consistent behavior of the agreed-upon set of functions operating on an object. These functions, when executed concurrently, should agree with each other about what they do with the object, in a predictable manner. Observations are only done by calling these functions and analyzing what they returned, and there should be a sequentially consistent history of calls and returns, including argument values and return values. It does not matter if the functions can see partial activity of concurrent operations internally; actually, in most cases they do. It does matter that when a function returns, its results are explainable given what the thread may know about the history of previous operations with the object.

Quoting Alexey Kukanov (Intel)

What the consumer does not know however is whether the producer really returned from the enqueue call. And linearizability does not provide such a promise. All it promises is that the operation effects important for the consumer are observed by the consumer at some instant point, no matter in what sequence and what time the calls were made. See my previous post - even with the lock based queue, no guarantee can be made that the producer has really returned from the call! Just that it has done its work.

Yes, a note in the documentation might be reasonable. The note should tell that making destruction of a concurrent object thread-safe by means of the object itself is impossible in principle, and that additional means/protocols, such as smart pointers, should be used at the application level.

Good point wrt unloading the module.

I guess the problem boils down to what counts as an observable effect. I see 3 variants:

(1) return values from the object's methods

(2) (1) + touching the object's data in any way

(3) (2) + executing code from the object's methods

All of these can be observed, so strictly speaking must be followed for a 100% linearizable object.

(1) is a kind of basic guarantee. We all agree here.

(3) IMVHO is too special a case; dynamic module loading/unloading is used in few programs, and even in those it is not a top-priority operation.

(2) IMVHO is what must be included in the definition of plain linearizability (when no special remarks are made). Because object creation/destruction is present in basically all programs, and moreover it is often a frequent operation.

Note that for some concurrent data structures (2) is not only a "nice" property, but an acknowledged requirement (namely, mutexes, reference-counted objects, etc).


And note that only (2) is provided by mutexes. And I guess the intent behind linearizability is "as if protected by a mutex".


Quoting Alexey Kukanov (Intel)

By unmapping the memory page which the object resides in? Oh well :) This way you can observe almost everything, e.g. changes of automatic variables on the stack of another thread.

Not necessarily by unmapping the memory. Just by deleting the object.


Quoting Alexey Kukanov (Intel)

To me, linearizability is about a consistent behavior of the agreed upon set of functions operating on an object. These functions, when executed concurrently, should agree with each other about what they do with the object, in a predictable manner. Observations are only done by calling these functions and analyzing what they return, and there should be a sequentially consistent history of calls and returns, including argument values and return values. It does not matter if the functions can see partial activity of concurrent operations internally; actually, in most cases they do. It does matter that when a function returns, its results are explainable given what the thread may know about the history of previous operations with the object.

It's Ok to define some special agreements wrt set of methods and observable effects, and even use the term "linearizability" with according remarks.

But if that is not done, "linearizability" means "as if protected by a mutex". That's what everybody expects.

For some concurrent algorithms "touching the object's data" is not just an observable effect, it is THE observable side effect:

bool release_object(object_t* obj)
{
    pthread_mutex_lock(&obj->mtx);
    int rc = --obj->rc;
    obj->timestamp = get_timestamp();
    pthread_mutex_unlock(&obj->mtx);
    return !rc;
}

vs:

bool release_object2(object_t* obj)
{
    int rc = obj->rc.fetch_sub(1) - 1;
    obj->timestamp.store(get_timestamp());
    return !rc;
}

I insist that release_object2() is not linearizable precisely because it touches the object after committing one of its side effects.


Quoting Dmitriy Vyukov
And note that only (2) is provided by mutexes. And I guess the intent behind linearizability is "as if protected by a mutex".

No, linearizability is a theoretical abstraction, and mutexes are just one way to implement it in practice.

By reasoning in terms of "too special" or "frequent operation" you are trying to attribute to the abstraction properties that are unrelated to it.

One can define the scope of applicability of the theoretical concept to a practical implementation, including both the operations set and the observable effects. In case of tbb::concurrent_queue it is the push() and pop() operations and their argument and return values. And linearizability means that once the push() method returns, it does not access its argument anymore, and the value it inserted into the queue is available for subsequent pops. And once a pop() has returned a value, this value is inaccessible to any other pop() call. Nothing more.

Atomicity, which is often associated with linearizability (though the original definition does not involve it), just means that the effects of concurrently running operations do not interfere with each other, even when the operations may overlap in time. E.g. Herlihy writes that "each operation should appear to take effect instantaneously", not that a single store (atomic at hardware level) must be used to implement linearizability.

Quoting Andrey Marochko (Intel)

Reasoning in terms of "too special" or "frequent operation" you are trying to attribute properties that are unrelated to the abstraction.

Indeed. That's what I meant by "All of these can be observed, so strictly speaking must be followed for a 100% linearizable object."


Quoting Andrey Marochko (Intel)

No, linearizability is a theoretical abstraction, and mutexes is just one way to implement it in practice.

Yes. But this does not contradict what I said. Linearizability is a theoretical abstraction which is built with "as if protected by a mutex" in mind (the intention behind it). Thus, a mutex automatically is one way to implement it.


Quoting Andrey Marochko (Intel)
Herlihy writes that "each operation should appear to take effect instantaneously", not that a single store (atomic at hardware level) must be used to implement linearizability.

Yes, but "should *appear* to take effect instantaneously" means that it should be indistinguishable, in any externally observable sense, from an implementation that uses a single atomic store. Which is not the case for algorithms that touch the object's data after the linearization point (TBB queue, M&S queue).

