A spin-wait loop is a technique used in multithreaded applications whereby one thread waits for other threads. The wait can be required for protection of a critical section, for barriers, or for other necessary synchronizations. Typically, the structure of a spin-wait loop consists of a loop that compares a synchronization variable with a predefined value.




On a system with Hyper-Threading Technology-enabled processors, a spin-wait loop consumes execution resources without contributing any useful work, which can negatively impact overall application performance.

On a processor with a superscalar, speculative execution engine, a fast spin-wait loop causes the waiting thread to issue multiple read requests as it races through the loop. These requests can execute out of order. When the processor detects that one thread has written to data for which another thread still has a read in progress, it must guarantee that no memory-order violation occurs. To restore the proper order of the outstanding memory operations, the processor incurs a severe penalty.




One common sequence that the processor frequently executes out of order is the spin wait. This tight loop generally consists of a handful of assembly instructions written here in pseudo-code:



top_of_loop:

load x into a register

compare to 0

if not equal, goto top_of_loop

else . . .
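The pseudo-code above can be sketched in C roughly as follows. This is a minimal illustration, not the article's own code: the helper name `spin_until_zero` and the `max_iterations` bound are assumptions added here (the bound only keeps the sketch from spinning forever), and C11 atomics are assumed for the shared variable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the pseudo-code: load x, compare to 0,
 * and go around again if it is not equal. max_iterations is an assumption
 * added so that this sketch always terminates. */
static bool spin_until_zero(atomic_int *x, unsigned long max_iterations)
{
    assert(x != NULL);
    for (unsigned long i = 0; i < max_iterations; ++i) {
        /* "load x into a register" and "compare to 0" */
        if (atomic_load_explicit(x, memory_order_acquire) == 0)
            return true;   /* the "else" path: the wait is over */
        /* "if not equal, goto top_of_loop" */
    }
    return false;          /* gave up: x was never observed as 0 */
}
```

In a real program another thread would store 0 to `x` to release the waiter; note that this version contains no pause, so it is exactly the kind of tight loop described above.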




Inserting the pause instruction can be done in one of two ways. With embedded assembly language, it is simply:
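As a hedged sketch of the embedded-assembly route, assuming a GCC- or Clang-compatible compiler (the Microsoft compiler uses a different inline-assembly syntax), the instruction can be wrapped in a small helper and placed in the loop body. The names `cpu_pause` and `spin_until_zero_with_pause`, the iteration bound, and the non-x86 fallback are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper; assumes GCC/Clang extended inline assembly. */
static inline void cpu_pause(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /* The "memory" clobber also acts as a compiler barrier. */
    __asm__ __volatile__("pause" ::: "memory");
#else
    /* Assumption: off x86 this sketch degrades to a compiler barrier only. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Bounded spin-wait with a pause on every unsuccessful pass.
 * max_iterations is an assumption so the sketch always terminates. */
static bool spin_until_zero_with_pause(atomic_int *x,
                                       unsigned long max_iterations)
{
    assert(x != NULL);
    for (unsigned long i = 0; i < max_iterations; ++i) {
        if (atomic_load_explicit(x, memory_order_acquire) == 0)
            return true;
        cpu_pause();   /* hint to the processor that this is a spin-wait */
    }
    return false;
}
```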







With the intrinsics supported by the Intel C++ compiler and newer versions of the Microsoft C/C++ compiler, the instruction is available as _mm_pause(). For example, a tight loop might be:


while ( x != synchronization_variable )
    _mm_pause();
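A fuller version of this loop might look like the sketch below. It assumes an x86 target where `<immintrin.h>` declares `_mm_pause()`. The `SPIN_PAUSE` macro, the `spin_until_equal` helper, the no-op fallback for other targets, and the `max_iterations` bound are hypothetical additions for illustration, and the shared variable is assumed to be a C11 atomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>            /* declares _mm_pause() */
#define SPIN_PAUSE() _mm_pause()
#else
#define SPIN_PAUSE() ((void)0)    /* assumption: no-op fallback off x86 */
#endif

/* Hypothetical helper: spin until synchronization_variable equals x,
 * pausing on every unsuccessful pass. max_iterations is an assumption
 * added so the sketch always terminates. */
static bool spin_until_equal(atomic_int *synchronization_variable, int x,
                             unsigned long max_iterations)
{
    assert(synchronization_variable != NULL);
    for (unsigned long i = 0; i < max_iterations; ++i) {
        if (atomic_load_explicit(synchronization_variable,
                                 memory_order_acquire) == x)
            return true;          /* values match: stop waiting */
        SPIN_PAUSE();
    }
    return false;                 /* bound reached without a match */
}
```

A waiting thread would call `spin_until_equal(&flag, 1, some_bound)` while another thread eventually stores 1 to `flag`; the bound lets the caller fall back to a blocking wait instead of spinning indefinitely.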


