Memory barriers and concurrent queue question

Hi,

I have two threads, where one is a consumer and the other is a producer.

Let's assume that the producer generates 1K of data at a memory location and would like to tell the consumer thread to process this data. Is it enough to transfer this data by pushing a pointer to it onto TBB's concurrent queue? Or do I have to explicitly add a write memory barrier and a read memory barrier so that the change is visible to the other thread?

Basically, I want to guarantee that the data the consumer thread sees after dereferencing the popped value is exactly what the producer thread generated.

Another way to ask the question is: does tbb's queue have any memory barriers built in?

Ilnar:

Another way to ask the question is: does tbb's queue have any memory barriers built in?

Yes, it does.

____________________ Борханов Ильнар

So, what you're saying is that regardless of the architecture, the following code will never print the std::cerr statement, and I don't need the lines that are commented out?

Can you please elaborate on your post, since I am still kind of skeptical.

So, what guarantees the following things:
1) the store to NonAtomicInt will not be reordered by the compiler
2) the load of NonAtomicInt will actually happen, and the compiler won't use a register (I have seen some people use volatile for this, but that has other implications as well)
3) the load will not be reordered by the compiler

bonus points (Non x86-64 hardware):
4) the store to NonAtomicInt will not be reordered by the hardware, e.g. on DEC
5) the load of NonAtomicInt will not be reordered by hardware, e.g. on DEC

I know that you can write some fancy stuff with C++11, but the examples only work with 1 or 2 ints.

What if mNonAtomicInt was a memory region of 1K bytes, and not just 1 int?
C++11 examples are welcome as well.


#include <iostream>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <tbb/atomic.h>
#include <tbb/concurrent_queue.h>
class Tester
{
public:
	Tester()
		: mNonAtomicInt(0)
	{
        mAtomicInt = 0;

        boost::thread t2(boost::bind(&Tester::Consume,this));
		boost::thread t1(boost::bind(&Tester::Produce,this));

		t1.join();
		t2.join();
	}


	void Produce()
	{
		mAtomicInt = 1;
		mNonAtomicInt = 1;
		///////_ReadWriteBarrier(); //compiler fence
		///////tbb::atomic_fence();
		mQ.push(1);
	}

	void Consume()
	{
		int j(0);

		mQ.pop(j);
		///////_ReadWriteBarrier(); //compiler fence
		///////tbb::atomic_fence();
		if (mNonAtomicInt != mAtomicInt)
		{
			std::cerr << "mNonAtomicInt != mAtomicInt " << mNonAtomicInt << "\t" << mAtomicInt << std::endl;
		}
	}

private:
	int									mNonAtomicInt;
	tbb::atomic<int>					mAtomicInt;
	tbb::concurrent_bounded_queue<int>	mQ;
};



int main()
{
	std::cout << "Hello world." << std::endl;

	Tester t;

	return 0;
}
jimdempseyatthecove:

The ctor of Tester() is not initializing mAtomicInt (the tbb::atomic<...> has no ctor).

Jim Dempsey

www.quickthreadprogramming.com

I thought when the TBB docs said "You can rely on zero-initialization to initialize an atomic to zero" (in the section "Why atomic<T> has no constructors") that it would be initialized to zero?

The main question still stands though. For now, I have edited the constructor above.

jimdempseyatthecove:

If your total application is what's represented by the code in the first post (plus the initializer for the atomic), then the two flags get set once on the first (and only) Produce while the Consume either has not yet run .OR. is blocked at the pop from the queue. Therefore your Consume should always observe 0.

However, should you expand your program, have Consume clear these two flags as it extracts data, and then Produce and Consume many items, then Consume might observe one flag being 1 and the other being 0. The reason is that Produce may be in the process of setting both of these flags while Consume is in the process of clearing both of them. When the two stores "collide," one will follow the other, and the follower's value is the one that persists. The situation is exacerbated as you add additional producers and consumers to the queue.

Jim Dempsey

www.quickthreadprogramming.com

Thank you Jim.

I am not sure what you mean by observing 0. At which point, and for which variable? Near the end of Consume, both mAtomicInt and mNonAtomicInt should read 1, right?

As for expanding this program: it is just a dummy program, designed for two threads that exchange a single int once, and nothing more. I am only trying to get the concept down for this example; keeping it minimal means other forum readers will actually read the code. :) Anything more general would add a lot more code.

The question I wanted to ask is more about sequential consistency and happens-before, and how that relates to tbb.

jimdempseyatthecove:

I was merely cautioning you about when you expand the code, because your example uses a state flag, presumably either a counter or an indicator. The caution is in regard to the time interval between the inc and push (or push and inc), as well as between the dec and pop (or pop and dec).

When you feed one item in and consume it before you feed the next item in, the observed states will be as your code expects.

However, should your code feed (push) additional entries prior to the consumption of an entry, then the consuming thread may observe counts, or a flag, with a state other than what you intend. Example:

inc
<------- observe
push

The observer (Consume) will observe a count one larger than in the queue (or one less in the case of push then inc). A similar situation exists when the Consume code dec, pop (or pop, dec) and the observation (if any) falls in between the operations.

If your intention is to use the flag/counter to start up an ancillary task on first fill, then your code (as written) will either start multiple consume tasks or fail to start a consumer task when required. These failures will (may) occur when the observation (by either producer or consumer) occurs in this critical window. Some threading paradigms use a critical section to avoid these types of failures. A critical section, though, is a heavyweight cure for a lightweight problem. If you want, I can expand upon this.

Jim Dempsey

www.quickthreadprogramming.com
