Towards Deterministic Multithreading in Games Part I


Summary

Multithreading is a complex subject. It's a powerful beast, and it needs careful design to get it right. By definition, multithreading on multi-core systems is not deterministic unless there is fine-grained control over the OS scheduler, which is usually not available to the average program.

This paper, Part I in a series, covers an introduction to multithreading. Part II will cover more game-related aspects.

Introduction

This section is aimed at readers who are new to multithreading. If you already have experience with multithreading, you can skim it and move on to the next section.

It's worth noting that multithreading is complex. Over the years, objects, constructs, and even algorithms that had been taken for granted as multithread-safe were later discovered to be fatally flawed. For example, there was a time when the C++ “volatile” keyword was thought to be enough to guarantee thread safety for shared variables. This isn't true.
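As a minimal illustrative sketch (not taken from the original article, and assuming a C++11 compiler) of why “volatile” is not enough: the snippet below increments a volatile counter and a std::atomic counter from two threads. The volatile counter routinely loses increments, while the atomic one never does.

    #include <atomic>
    #include <iostream>
    #include <thread>

    // 'volatile' only keeps the compiler from caching the value; it does NOT
    // make the read-modify-write of ++ atomic, nor does it order it across threads.
    volatile int unsafe_counter = 0;

    // std::atomic provides the atomicity and ordering guarantees that
    // 'volatile' was once (wrongly) assumed to provide.
    std::atomic<int> safe_counter{0};

    void work()
    {
        for (int i = 0; i < 100000; ++i)
        {
            ++unsafe_counter;  // data race: increments from the two threads can be lost
            ++safe_counter;    // atomic increment: never lost
        }
    }

    int main()
    {
        std::thread a(work), b(work);
        a.join();
        b.join();

        // safe_counter always prints 200000; unsafe_counter usually prints less.
        std::cout << "volatile:    " << unsafe_counter << "\n"
                  << "std::atomic: " << safe_counter   << "\n";
    }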

Multithreading is the ability to run multiple threads concurrently. In a multi-core system with n cores, up to n threads may run at the same time, each on a different core. The immediate advantage is speedup, which can theoretically be as high as n times; but it brings a number of problems, because the application no longer runs serially (if C is issued after B, and B after A, it is no longer safe to assume that C completes after A).
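To make the loss of serial ordering concrete, here is a small illustrative C++11 sketch (not from the original article): it launches one thread per core and lets each print a message, and the order of the messages typically changes from run to run.

    #include <algorithm>
    #include <cstdio>
    #include <thread>
    #include <vector>

    int main()
    {
        // One software thread per hardware core (fall back to 1 if unknown).
        const unsigned n = std::max(1u, std::thread::hardware_concurrency());

        std::vector<std::thread> workers;
        for (unsigned i = 0; i < n; ++i)
            workers.emplace_back([i] {
                // The threads run concurrently, so the order of these lines
                // depends on the OS scheduler and differs between runs.
                std::printf("task %u finished\n", i);
            });

        for (auto& t : workers)
            t.join();
    }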

First, let’s define some terms:

  • Scalability: For the purposes of this paper, scalability is the ratio of performance gained to the number of cores used. (In a more general sense, scalability is the ability of a program to improve its performance as system resources are added, such as more CPUs, memory, or bus bandwidth.) If doubling the number of threads yields close to a 2x performance improvement, scalability is very high. In practice, scalability is rarely that high and can be difficult to achieve. See Amdahl's law, and the sketch after this list.
  • Atomic Operation: An atomic operation is an operation that can’t be divided: it either executes completely or not at all, and it can never be observed in a partially executed state.
  • Overhead: Overhead is the extra work spent on anything other than the actual job. When threading a subroutine, additional processing time goes into creating threads and synchronizing them. If this extra effort (the overhead) is greater than the gain from threading the job, the multithreaded code can end up slower than the serial code.
  • Over-/Full-/Under-subscription: Subscription is the number of threads running at the same time. Full-subscription means there are as many threads running as cores available (e.g., four threads in a quad-core CPU). Over-subscription is running more threads than the available number of cores (e.g., three threads in a dual-core CPU); and under-subscription means there are still CPU cores not being used because we use fewer threads than there are cores available.
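As a small illustrative sketch (not from the original article), the C++11 snippet below queries the core count to pick a full-subscription thread count and evaluates Amdahl's law, speedup = 1 / ((1 - p) + p / n), for an assumed parallelizable fraction p; the value p = 0.80 is a made-up example, not a measured figure.

    #include <algorithm>
    #include <cstdio>
    #include <thread>

    // Amdahl's law: if a fraction p of the work can be parallelized,
    // the best possible speedup on n cores is 1 / ((1 - p) + p / n).
    double amdahl_speedup(double p, unsigned n)
    {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int main()
    {
        // Full subscription: one software thread per hardware core.
        const unsigned cores = std::max(1u, std::thread::hardware_concurrency());

        const double p = 0.80;  // assumed (hypothetical) parallelizable fraction
        std::printf("cores available: %u\n", cores);
        std::printf("theoretical speedup at full subscription: %.2fx\n",
                    amdahl_speedup(p, cores));
        std::printf("upper bound with unlimited cores: %.2fx\n", 1.0 / (1.0 - p));
    }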