n-bodies: a parallel TBB solution: parallel code: so what does TBB_USE_THREADING_TOOLS do?

Our East Coast Parallelism Road Show was a success, and having finally caught up with some of the work that piled up while I was gone, I’ll squeeze out enough time to at least add a footnote to a previous rambling.

In my last bumbling about, I tried defining the TBB_USE_THREADING_TOOLS macro as a stab at finding the problem with my Intel® Parallel Inspector analysis of nbodies.  It didn’t seem to do much at the time, so I thought it might be interesting to find out what it really does.  It was easy to find examples of it in the open source.  spin_mutex.h contains a scoped lock constructor:

        //! Construct and acquire lock on a mutex.
        scoped_lock( spin_mutex& m ) {
            my_unlock_value = __TBB_LockByte(m.flag);

Examining the details of a lock before even talking about races puts the cart a bit before the horse in my way-slower-than-expected narrative about parallelizing some code, but it seems appropriate as a footnote.

So, what’s going on up there?  In the non-TBB_USE_THREADING_TOOLS case, something called __TBB_LockByte is called with a field of the spin mutex object (probably a byte?), which must be the lock itself: a gate that only one thread gets past at a time.  Then the spin mutex object is stashed until later, presumably so the lock can be released.  If multiple threads made this __TBB_LockByte call at the same time, they would contend for that byte, and a tool designed to detect data races might flag the operation as suspect.
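To make the "gate" idea concrete, here is a minimal sketch of a byte-sized spin lock.  I haven’t peeled back __TBB_LockByte itself, so this is my assumption about the general technique rather than TBB’s code: byte_lock, lock_byte, unlock_byte, and count_with_lock are all hypothetical names, and the atomic exchange is a stand-in for whatever TBB really does with the flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the lock byte inside a spin mutex.
struct byte_lock {
    std::atomic<unsigned char> flag{0};   // 0 = free, 1 = held
};

// Spin until this thread is the one that flips the byte from 0 to 1.
// Illustrative only; this is not TBB's __TBB_LockByte.
inline void lock_byte(byte_lock& l) {
    while (l.flag.exchange(1, std::memory_order_acquire) != 0) {
        std::this_thread::yield();        // be polite while waiting
    }
}

// Open the gate again for the next thread.
inline void unlock_byte(byte_lock& l) {
    l.flag.store(0, std::memory_order_release);
}

// Each thread bumps a plain int `iterations` times while holding the lock.
inline int count_with_lock(int threads, int iterations) {
    byte_lock l;
    int counter = 0;                      // protected by l
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&l, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock_byte(l);
                ++counter;
                unlock_byte(l);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with_lock(4, 10000) should come back with every increment accounted for, because only one thread at a time gets past the gate.  The threads all hammer on the same byte, though, which is exactly the sort of shared access a race detector is built to notice.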

On the other hand, when TBB_USE_THREADING_TOOLS is defined, it looks like the local mutex pointer is set to a safe value and the mutex itself is passed to another function, internal_acquire(), effectively hiding any lock funny business from our correctness inspection tool.  So that’s what it does.  Maybe after I introduce scoped locks, I’ll come back here and peel another layer, and we can look at the alternate implementations of the lock.
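Here is a rough sketch of the two branches described above, switched at compile time.  Everything in it is hypothetical: DEMO_USE_THREADING_TOOLS stands in for TBB_USE_THREADING_TOOLS, demo_mutex for spin_mutex, and tool_log for whatever an analysis tool would actually be told.  In particular, the notifications inside internal_acquire() are my guess at why a separate function is useful, not something I have confirmed in the TBB source.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical switch standing in for TBB_USE_THREADING_TOOLS.
#ifndef DEMO_USE_THREADING_TOOLS
#define DEMO_USE_THREADING_TOOLS 1
#endif

// Stand-in for whatever a correctness tool would be told; here, just a log.
inline std::vector<std::string>& tool_log() {
    static std::vector<std::string> log;
    return log;
}

class demo_mutex {
    std::atomic<unsigned char> flag{0};   // 0 = free, 1 = held
public:
    class scoped_lock {
        demo_mutex* my_mutex;

        // Raw acquisition: spin until the byte flips from 0 to 1.
        static void lock_byte(std::atomic<unsigned char>& f) {
            while (f.exchange(1, std::memory_order_acquire) != 0) {}
        }

        // Separate function: same lock, bracketed by notifications (assumed).
        void internal_acquire(demo_mutex& m) {
            tool_log().push_back("prepare");
            lock_byte(m.flag);
            my_mutex = &m;
            tool_log().push_back("acquired");
        }
    public:
        explicit scoped_lock(demo_mutex& m) {
#if DEMO_USE_THREADING_TOOLS
            my_mutex = nullptr;           // safe value until the lock is held
            internal_acquire(m);
#else
            lock_byte(m.flag);            // direct path, no notifications
            my_mutex = &m;
#endif
        }
        ~scoped_lock() {
            my_mutex->flag.store(0, std::memory_order_release);
#if DEMO_USE_THREADING_TOOLS
            tool_log().push_back("released");
#endif
        }
    };
    bool is_locked() const { return flag.load() != 0; }
};
```

With the switch on, constructing a demo_mutex::scoped_lock leaves "prepare" and "acquired" in the log, and destroying it adds "released".  With the switch off, the same lock is taken and nothing is recorded.  The locking behavior is identical either way; only what an observer gets to see changes.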
