Hello,
Look at: http://www.emadar.com/fpc/lockfree.htm
In the pop() side of this lockfree quueue we have:
--------------
function gFLQueue.pop(var tm:_R):boolean;
var newhead, lastHead : integer;
begin
repeat
lastHead:=head;
if tail<>head
then
begin
pointer(newHead):=interlockedCompareExchange(pointer(head),pointer(lastHead
+1),pointer(lasthead));
if newHead=lastHead
then
begin
tm:=getObject(lastHead);
result:=true; exit;
end; end
else
begin
result:=false;
exit;
end;
until false;
end;
end.
----
If you have noticed, after he is incrementing head with an
interlockedCompareExchange(), he is doing a tm:=getObject(lastHead);
and this is an error i think, cause if the thread
is prempeted and another thread have put another value in the same
place as lashead, the value willl be invalid and corrupted.
I have corrected this problem in my lockfree_mpmc fifo queue version
1.12
So, becarefull with those lockfree algorithms.
And i have tried to enhance my lockfree_mpmc and i have
used an exponential backoff in the pop side of the algorithm and it
gives better performance under contention with 28.317 millions pop transations
per second.
The queue with a windows critical section on both the push and the pop
sides of the algorithm scored very bad under contention with
4 threads. Here is the benchmarks using the windows critical section:
Only 370692 push transactions per second
and only 503024 push transactions per second
Now the list of memory ordering guarantees in the Intel x86 memory model:
Loads are not reordered with other loads.
Stores are not reordered with other stores.
Stores are not reordered with older loads.
In a multiprocessor system, memory ordering obeys causality (memory
ordering respects transitive visibility).
In a multiprocessor system, stores to the same location have a total
order.
In a multiprocessor system, locked instructions have a total order.
Loads and stores are not reordered with locked instructions.
But Loads may be reordered with older stores to different locations
But what is important for us is MEMORY VISIBILITY across threads..
So, the only variables in lockfree_mpmc that are used across threads
are tail and head.. And since in a multiprocessor system, locked instructions
have a total order. So, the lock cmpxchg (inside the CAS) in the push() side
of lockfree_mpmc does in fact achieve sequential consistency and
LockedIncLong(temp) in the push side of lockfree_mpmc 1.12 also
achieve sequential consistency
And tail and head are properly aligned to avoid false sharing...
Andof course i was speaking about sequential consistency in the x86 architecture..
You candownload lockfree_mpmc 1.12 from:
http://pages.videotron.com/aminer/
Amine Moulay Ramdane.



