Pondering Timed Mutex

On the Intel® Software Network TBB Forum, a message was posted requesting support for a timed mutex (/en-us/forums/intel-threading-building-blocks/topic/60190/).

Our initial reaction to the post was "Why not?"  The POSIX standard has pthread_mutex_timedlock(), which provides the functionality [1].  The latest C++200x Standard working draft [2] includes timed_mutex and recursive_timed_mutex (see Section 30.3.2), which were voted in at the C++ Standards Committee's 2007 Kona meeting (see also [3,4]).  No time was wasted, and a proposal was made to extend the TBB wrapper mutexes (tbb::mutex and tbb::recursive_mutex) to provide the functionality while leaving the other mutexes (such as tbb::spin_mutex) intact.

Then a question of uniformity and usability arose.  For example, a user may define a template class and pass a timed mutex as a template argument; we don't want users to have to remember which mutexes have the feature and which don't.  This led us to mull over adding a timed variant of try_acquire() to all TBB mutexes.  Unfortunately, that raised more questions than it resolved.
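To illustrate the uniformity concern, such generic code might look roughly like the following.  This is a hypothetical sketch; the timed try_acquire overload shown here does not exist in TBB, and the class and member names are made up for illustration.

// Hypothetical sketch: generic code that takes any TBB-style mutex as a
// template argument and relies on a timed try-acquire. If only some
// mutexes offered the timed variant, this template would work for some
// instantiations and fail to compile for others.
template<typename Mutex>
class bounded_wait_counter {
    Mutex my_mutex;
    long  my_count;
public:
    bounded_wait_counter() : my_count(0) {}
    // Returns false if the lock could not be obtained within the given interval.
    bool try_increment( size_t timeout_in_milli ) {
        typename Mutex::scoped_lock lock;
        if( !lock.try_acquire( my_mutex, timeout_in_milli ) ) // hypothetical timed overload
            return false;
        ++my_count;
        return true;
    }
};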

As Arch succinctly explained in one of the TBB Forum responses, "the initial design of TBB targeted a paradigm of task-based programming for parallel speedup....  mutexes were designed with that in mind; i.e., to avoid race conditions.  Blocking was expected to be short, otherwise the program would not scale.  Therefore timeouts were not a priority."  This is also why TBB lacks support for condition variables.  So, have the TBB priorities changed?  In my opinion, no, they haven't.  Besides, TBB's spin_mutex is named to reflect its spin-wait nature.  Could we still call it spin_mutex after adding a wait with a time-out to it?

Secondly, platforms differ in the degree to which they support timed mutexes.  A Windows critical section [5] implicitly supports a timed lock, although a user cannot change the time-out interval on a per-instance basis; the support seems to exist mainly for debugging purposes.  pthread_mutex_lock() has no notion of a time-out; if a deadlock occurs, the application remains deadlocked until it is explicitly aborted.  Instead, POSIX specifies pthread_mutex_timedlock(), a sibling of pthread_mutex_lock() that allows a user to specify a time-out, but its support is only optional (see http://people.redhat.com/drepper/posix-option-groups.html and http://lists.apple.com/archives/Unix-porting/2008/Jan/msg00014.html).  Then we found that even the C++ standardization committee at the 2007 Kona meeting failed to vote unanimously for inclusion of timed_mutex in the working draft [6].
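For reference, the POSIX call takes an absolute deadline rather than a relative interval.  A minimal sketch, with error handling elided and assuming a platform where the Timeouts option is supported:

#include <pthread.h>
#include <time.h>
#include <errno.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

// Try to acquire m, giving up after roughly 500 milliseconds.
bool lock_with_timeout()
{
    struct timespec deadline;
    clock_gettime( CLOCK_REALTIME, &deadline );  // POSIX requires an absolute, CLOCK_REALTIME-based deadline
    deadline.tv_nsec += 500 * 1000000L;          // add 500 ms
    if( deadline.tv_nsec >= 1000000000L ) {      // normalize the timespec
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    int rc = pthread_mutex_timedlock( &m, &deadline );
    return rc == 0;                              // rc == ETIMEDOUT if the wait timed out
}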

This string of events got me looking for compelling use cases for timed mutexes.  To my disappointment, I have not been very successful.  All I found was a class of real-time applications with hard deadlines that need timed mutexes as a fall-back, and that is not TBB's focus area.  Andrey Marochko suggested self-restarting servers as a use case: if a server cannot acquire a mutex within a pre-set period of time, it may assume that an unrecoverable error has occurred and restart itself.  Again, TBB does not cater to this type of application, at least for now.  Then I came across Hans Boehm's paper [7], in which he sounded skeptical about supporting timed mutexes in C++200x.  The paper even quoted Doug Lea's comments about the timed mutex methods in Java®, which sounded equally skeptical.

Nevertheless, there are TBB users who would like to use timed mutexes, and we don't want to disenfranchise them.  Furthermore, either new breeds of use cases will emerge and timed mutexes will be adopted as part of the C++200x standard, or none will be found and timed mutexes will ultimately be discarded.  Until that moment arrives, an implementation based on a template function seems a good compromise that does not take a toll on the existing mutexes.

The following is a sketch of the implementation.

#include "tbb_stddef.h"   // __TBB_ASSERT
#include "tbb_machine.h"  // internal::AtomicBackoff
#include "tick_count.h"   // tick_count::interval_t
#include "tbb_thread.h"   // this_tbb_thread::sleep

namespace tbb {

template<typename Mutex>
bool try_acquire_with_timeout( Mutex& mutex, typename Mutex::scoped_lock& lock, const size_t interval_in_milli /* in milliseconds */)
{
    internal::AtomicBackoff bo;
    __TBB_ASSERT( interval_in_milli>0, NULL );
    size_t interval_in_micro = interval_in_milli * 1000; // remaining wait budget, in microseconds
    tick_count::interval_t i_100( 0.0001 /* seconds */ ); // 100 microseconds == 0.1 milliseconds
    do {
        if( lock.try_acquire( mutex ) )
            return true;

        if( !bo.bounded_pause() ) {
            this_tbb_thread::sleep(i_100); // sleep for 100 micro seconds
            interval_in_micro -= 100;
            bo.reset();
        }
    } while ( interval_in_micro>0 ) ;
    return false;
}

} // namespace tbb
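Usage would look roughly like this (a sketch, assuming the template function above is visible to the caller):

#include "tbb/mutex.h"

tbb::mutex my_mutex;

void update_shared_state()
{
    tbb::mutex::scoped_lock lock;
    // Give up after 250 milliseconds instead of blocking indefinitely.
    if( tbb::try_acquire_with_timeout( my_mutex, lock, 250 ) ) {
        // ... touch the data protected by my_mutex; the lock is released
        //     when lock goes out of scope ...
    } else {
        // ... fall back: report the time-out, retry later, etc. ...
    }
}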

I am aware that the TBB users who are waiting for timed mutex support may not be 100% satisfied with the proposed solution.  If you are one of them and you think you have a convincingly good use case for a timed mutex, please speak up.  We would like to hear about it.  Let your voice be heard.

[1] pthread_mutex_timedlock.
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_timedlock.html

[2] The ISO C++ Standards Committee, "Standard for Programming Language C++, Working Draft." 2008-06-27. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2691.pdf

[3] A. Williams. "Thoughts on a Thread Library for C++." 2006-11-06.
http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2006/n2139.html

[4] H. Hinnant. "Mutex, Lock, Condition Variable Rationale." 2007-09-09.
http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2007/n2406.html

[5] Windows EnterCriticalSection. http://msdn.microsoft.com/en-us/library/ms682608(VS.85).aspx (as of Sep 17, 2008)

[6] Threads API Review Committee Report. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2516.html

[7] H.-J. Boehm. "Timed_mutex in C++0x." http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2528.html


Comments

Raf Schietekat:

With a queueing mutex, this does not behave as expected.

Comment:

What is the expected behavior and what is the actual behavior?

queuing_mutex::scoped_lock::try_acquire() immediately returns if the mutex cannot be acquired.

Raf Schietekat:

I would expect that try_acquire_with_timeout(infinity) behaves exactly like acquire().

In this case the expectation of fairness is not met.

I don't think there can be a default implementation without renaming it to spin_on_try_acquire_with_timeout.

Comment:

I see your point. I think the confusion comes not from try_acquire() but rather from queuing_mutex's acquire().
The latter does not convey that it has the semantics of a "fair acquire". What I mean is

Given
queuing_mutex qmtx;
queuing_mutex::scoped_lock qlock;

while( !qlock.try_acquire(qmtx) ) ;

and

qlock.acquire(qmtx);

are not equivalent.

Jim:

Convincingly good use case? Not sure if you'll agree, but you might be interested in the three kinds of applications that Scherer lists on page 10 of his thesis (http://www.cs.rice.edu/~wns1/papers/2006-scherer-thesis.pdf):

1. A thread in a soft real-time application may need to bound the time it spends
waiting for a lock. If the timeout expires, the thread can choose to announce an
error or to pursue an alternative code path that does not require the lock.

2. If a thread is preempted while holding a lock, timeout allows other threads waiting
for the lock to give up, yield the processor, and try again when rescheduled.
(This assumes there are enough non-waiting threads to keep the processor busy.)

3. In a parallel database system, timeout provides a viable strategy for deadlock
recovery. A thread that waits "too long" for a lock can assume that deadlock
has occurred, abort the current transaction, yield the processor, and retry when
rescheduled.

On another point, this try_acquire_with_timeout function has the advantage of being general. But on the other hand, it lacks fairness (as has been said), and it can cause the calling thread to be asleep when the mutex becomes available, i.e. the thread won't always gain the mutex as early as possible. Lastly, I don't think a thread can really sleep for 100 microseconds on Win32, so the timing would only be approximate on that platform.
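If it helps the discussion, the third scenario might look roughly like this on top of the proposed template function. This is only a hypothetical sketch; the roll-back step and the retry limit are placeholders.

#include "tbb/mutex.h"
#include "tbb/tbb_thread.h"

bool run_transaction( tbb::mutex& row_lock )
{
    const int max_attempts = 3;                    // arbitrary retry limit
    for( int attempt = 0; attempt < max_attempts; ++attempt ) {
        tbb::mutex::scoped_lock lock;
        // Waiting "too long" is taken as a sign of probable deadlock.
        if( tbb::try_acquire_with_timeout( row_lock, lock, 1000 /* ms */ ) ) {
            // ... perform the transaction under the lock ...
            return true;
        }
        // ... roll back any partial work here (placeholder) ...
        tbb::this_tbb_thread::yield();             // give up the processor before retrying
    }
    return false;                                  // report failure after the last attempt
}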

Andrey Marochko (Intel):

Thanks for the interesting use cases, Jim. But I believe that traditional locking APIs (acquire, acquire_with_spin_count, try_acquire) would be more suitable in some of the situations you mentioned. Let's consider them case by case.

1. If the thread has some alternative work that can be done before proceeding with the code under the lock, then why wait at all? Just try to acquire the lock, and if that fails, go ahead with the moonlighting. Then repeat it all again.

2. This scenario is exactly what a Windows critical section with a spin count is for. It first checks whether the lock is free, and if not, spins for some time. If the lock has not become available after the specified number of attempts, it goes into the kernel to wait. But no timeout is necessary here.
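For reference, the spin count is supplied when the critical section is initialized, roughly like this (a minimal sketch; 4000 is an arbitrary value):

#include <windows.h>

CRITICAL_SECTION cs;

void init()
{
    // Spin up to 4000 times in user mode before falling back to a kernel wait.
    InitializeCriticalSectionAndSpinCount( &cs, 4000 );
}

void worker()
{
    EnterCriticalSection( &cs );   // spins, then blocks; no timeout involved
    // ... short critical section ...
    LeaveCriticalSection( &cs );
}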

Thus in these two cases simpler and more straightforward APIs seem to work better.

The part of #1 where the thread announces failure and #3 are actually very close, since both assume the ability to roll back actions done before the failure point. Technically, having a mutex with a timeout looks quite appealing in such situations. But I have an uneasy feeling that this approach is methodologically wrong.

If an application cannot guarantee that it is deadlock-free, then it definitely has a problem that should be fixed. Of course there is the case of plug-in architectures, but even here there are better ways. For example, unreliable plug-ins (those not developed or certified by the host program owner) can be run in a separate thread pool (if not in a separate process). If someone's piece of code hangs, there is only a vanishingly small chance that it will ever "unhang". So backing off on timeout, expecting that next time everything will be all right, seems to be a futile hope most of the time.

What is left then is to consider timeouts as a workaround. But I do not think that the library should encourage using workarounds instead of solving the root problem. Besides, a conventional try_acquire call can easily be converted into acquire_with_timeout by those who desperately need it.