Lockfree_mpmc version 1.12 ...

Lockfree_mpmc version 1.12 ...

aminer10's picture

Hello,

Look at: http://www.emadar.com/fpc/lockfree.htm

In the pop() side of this lockfree quueue we have:

--------------

function gFLQueue.pop(var tm:_R):boolean;
var newhead, lastHead : integer;

begin
repeat
lastHead:=head;
if tail<>head
then
begin
pointer(newHead):=interlockedCompareExchange(pointer(head),pointer(lastHead
+1),pointer(lasthead));
if newHead=lastHead
then
begin
tm:=getObject(lastHead);
result:=true; exit;
end; end
else
begin
result:=false;
exit;
end;
until false;
end;
end.
----

If you have noticed, after he is incrementing head with an
interlockedCompareExchange(), he is doing a tm:=getObject(lastHead);
and this is an error i think, cause if the thread
is prempeted and another thread have put another value in the same
place as lashead, the value willl be invalid and corrupted.
I have corrected this problem in my lockfree_mpmc fifo queue version
1.12
So, becarefull with those lockfree algorithms.

And i have tried to enhance my lockfree_mpmc and i have
used an exponential backoff in the pop side of the algorithm and it
gives better performance under contention with 28.317 millions pop transations
per second.

The queue with a windows critical section on both the push and the pop
sides of the algorithm scored very bad under contention with
4 threads. Here is the benchmarks using the windows critical section:

Only 370692 push transactions per second
and only 503024 push transactions per second

Now the list of memory ordering guarantees in the Intel x86 memory model:

Loads are not reordered with other loads.
Stores are not reordered with other stores.
Stores are not reordered with older loads.
In a multiprocessor system, memory ordering obeys causality (memory
ordering respects transitive visibility).
In a multiprocessor system, stores to the same location have a total
order.

In a multiprocessor system, locked instructions have a total order.
Loads and stores are not reordered with locked instructions.

But Loads may be reordered with older stores to different locations
But what is important for us is MEMORY VISIBILITY across threads..

So, the only variables in lockfree_mpmc that are used across threads
are tail and head.. And since in a multiprocessor system, locked instructions
have a total order. So, the lock cmpxchg (inside the CAS) in the push() side
of lockfree_mpmc does in fact achieve sequential consistency and
LockedIncLong(temp) in the push side of lockfree_mpmc 1.12 also
achieve sequential consistency

And tail and head are properly aligned to avoid false sharing...

Andof course i was speaking about sequential consistency in the x86 architecture..

You candownload lockfree_mpmc 1.12 from:

http://pages.videotron.com/aminer/

Amine Moulay Ramdane.

3 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.
aminer10's picture

Hello,

How to verify that lockfree_mpmc is correct ?

You can for example reason like this

first , in the source code i am using a margin of

1000 threads, so you can not have more than 1000 threads

accessing the lockfree queue at the same time...

And here is the ObjectPascal source code of the push() side:

---

function TLockfree_MPMC.push(tm : tNodeQueue):boolean;

var lasttail,newtemp:longword;

begin

if getlength >= fsize

then

begin

result:=false;

exit;

end;

result:=true;

newTemp:=LockedIncLong(temp);

lastTail:=newTemp-1;

setObject(lastTail,tm);

repeat

if CAS(tail,lasttail,newtemp)

then

begin

exit;

end;

asm pause end;

until false;

end;

----

So, let say the size of the bounded lockfree_mpmc queue is 1000 , and

imagine that the threads are executing the "if getlength >= fsize " all at the same
time, and imagine that the getlength() returns 999, so, the
"if getlength >= fsize" will returns false , and since we have

fSize:=(1 shl aPower) - margin inside the constructor,

with a margin of 1000 (margin must be >= to the number of threads) ,
there will be nooverflow.

And here is the source code of the pop() side:

----

function TLockfree_MPMC.pop(var obj:tNodeQueue):boolean;

var lastHead : longword;

begin

repeat

lastHead:=head;

if tail<>head

then

begin

obj:=getObject(lastHead);

if CAS(head,lasthead,lasthead+1)

then

begin

result:=true;

exit;

end

else backoff2.delay;

end

else

begin

result:=false;

exit;

end;

until false;

end;

----

As you have noticed, there is a test like this:

if CAS(head,lasthead,lasthead+1) , and this test

avoids something like the ABA problem, cause if head

have changed , the CAS() will fail.

Thank you.

Amine Moulay Ramdane

tomorrowwillbefine's picture

Microsoft Office 2010 is actually the newest software from microsoft office 2010 keys Microsoft Corporation introduced in the last year. Its leading aims tend to be to catch the present business requirements and to be on top of every competition with regard to the international market criteria. This can be a very good idea to obtain Microsoft Office 2010 Key immediately to maintain norton antivirus keys yourself up-to-date and to present you with the vast qualified progress opportunities for success. Microsoft Office 2010 is available in both 32-bit and 64-bit editions, but attention please the two are not able to co-exist on the very same personal computer. All of the Office 2010 editions are kaspersky antivirus keys suitable for Windows XP SP3, Windows Vista and Windows 7.

www.keyyeah.com

Login to leave a comment.