Compare and Swap Loop


Atomically update a scalar value so that a predicate is satisfied.


Often a shared variable must be updated atomically, by a transform that maps its old value to a new value. The transform might be a transition of a finite state machine, or recording global knowledge. For instance, the shared variable might be recording the maximum value that any thread has seen so far.


  • The variable is read and updated by multiple threads.

  • The hardware implements "compare and swap" for a variable of that type.

  • Protecting the update with a mutex is to be avoided.


  • Reduction
  • Reference Counting


The solution is to atomically snapshot the current value, and then use atomic<T>::compare_and_swap to update it. Retry until the compare_and_swap succeeds. In some cases it may be possible to exit before the compare_and_swap succeeds because the current value meets some condition.

The template below does the update x=f(x) atomically.

// Atomically perform x=f(x).
template<typename F, typename T>
void AtomicUpdate( atomic<T>& x, F f ) {
   int o;
   do {
       // Take a snapshot
       o = x;
       // Attempt to install new value computed from snapshot
   } while( x.compare_and_swap(f(o),o)!=o );

It is critical to take a snapshot and use it for intermediate calculations, because the value of X may be changed by other threads in the meantime.

The following code shows how the template might be used to maintain a global maximum of any value seen by RecordMax.

// Atomically perform UpperBound = max(UpperBound,y) 
void RecordMax( int y ) {
   extern atomic<int> UpperBound;
   AtomicUpdate(UpperBound, [&](int value){return std::max(value,y);} );

When y is not going to increase UpperBound, the call to AtomicUpdate will waste time doing the redundant operation compare_and_swap(o,o). In general, this kind of redundancy can be eliminated by making the loop in AtomicUpdate exit early if f(o)==o. In this particular case where F==std::max<int>, that test can be further simplified. The following custom version of RecordMax has the simplified test.

// Atomically perform UpperBound =max(UpperBound,y) 
void RecordMax( int y ) . .
   extern atomic<int> UpperBound;
   do {
       // Take a snapshot
       int o = UpperBound;
       // Quit if snapshot meets condition.
       if( o>=y ) break;
       // Attempt to install new value.
   } while( UpperBound.compare_and_swap(y,o)!=o );

Because all participating threads modify a common location, the performance of a compare and swap loop can be poor under high contention. Thus the applicability of more efficient patterns should be considered first. In particular:

  • If the overall purpose is a reduction, use the reduction pattern instead.

  • If the update is addition or subtraction, use atomic<T>::fetch_and_add. If the update is addition or subtraction by one, use atomic<T>::operater++ or atomic<T>::operator--. These methods typically employ direct hardware support that avoids a compare and swap loop.


When using compare_and_swap to update links in a linked structure, be sure you understand if the "ABA problem" is an issue. See the Internet for discourses on the subject.