Hardware support for Locks

Locks are a problematic mechanism because they can slow down the system. Sometimes you just need them, usually when working with low-level APIs and the lower levels of an infrastructure.

There are four basic ways of using a lock:

* Spin-lock : will retain the CPU core until a condition is met

* Atomic Operation : single operation using a predefined CPU Op-Code

* Kernel Lock : such as a MUTEX, which can be automatically unlocked if the owning thread terminates

* Fast-Lock : such as a Critical Section, which is light-weight but more sensitive to bugs
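To make two of these approaches concrete, here is a minimal sketch in portable C++ that increments a shared counter once with a single atomic operation and once under a lock. The function names `count_with_atomic` and `count_with_mutex` are made up for this illustration, and `std::mutex` is only a portable stand-in: depending on the platform it may behave more like a Fast-Lock than a true Kernel Lock.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Atomic Operation: every increment is one atomic read-modify-write,
// no separate lock object is involved.
long count_with_atomic(int threads, int iterations) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}

// Lock object: the counter is a plain variable and a separate mutex
// (assumption: standing in for MUTEX / Critical Section) protects it.
long count_with_mutex(int threads, int iterations) {
    long counter = 0;
    std::mutex guard;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &guard, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(guard);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling either function with, say, 4 threads and 10000 iterations should return 40000. Note that in the second version nothing ties `guard` to `counter` except the programmer's discipline.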

Locks are implemented internally in one of three ways: preventing tasks from switching, disabling Interrupts, or using an Atomic Operation.
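The third option, a lock built on an Atomic Operation, can be sketched as a Spin-lock on top of `std::atomic_flag`. This is only an illustration under stated assumptions, not how any particular operating system implements its locks. The names `SpinLock` and `count_with_spinlock` are hypothetical.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// A Spin-lock whose only building block is an atomic test-and-set.
class SpinLock {
public:
    void lock() {
        // Keep the core busy until the flag was previously clear,
        // which means this thread is the one that set it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: several threads increment a plain counter
// while holding the Spin-lock.
long count_with_spinlock(int threads, int iterations) {
    long counter = 0;
    SpinLock guard;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &guard, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinLock> lock(guard);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Because `SpinLock` exposes `lock()` and `unlock()`, it works with `std::lock_guard`. Every acquisition attempt is an atomic operation on the flag, so contention on the lock turns directly into contention on that one memory location.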

Atomic Operations use an internal lock. This lock is system wide, and every time a core uses an Atomic Operation it slows down all the other cores. Disabling Interrupts behaves in a similar way. This is because the CPU cannot know exactly what it is that we are locking. The lock object is not really connected to the resource / buffer, so the system uses the global lock.

Now that we have things like NUMA, which distinguishes between different RAM modules, I would expect to also have some form of hardware acceleration for locks.

First of all, there is no reason to use a global lock if different cores access different physical RAM modules.

There is also no reason to use a CPU wide lock if two cores are running completely different applications and will never use the same lock objects. The CPU hardware could provide some acceleration in which lock objects such as a Critical Section or an un-named MUTEX only use the CPU wide lock if the same process is running on two different cores. Otherwise the lock should be internal to the core.

If I could go even further, I would have the lock object tied to the buffer in a hardware table, so that only the thread that holds the lock has read / write access permission, and the other threads or processes have no page access permissions.

Whichever the solution may be, there should be some hardware support for lock objects. Memory mapped files are just not enough.
