3.3. Data Placement Using Explicit Memory Allocation Directives
Another approach to data placement in NUMA-based systems is to use system APIs that explicitly specify where memory pages are allocated. One example of such an API is the libnuma library for Linux.
3.2. Data Placement Using Implicit Memory Allocation Policies
In the simple case, many operating systems transparently provide support for NUMA-friendly data placement. When a single-threaded application allocates memory, the operating system assigns memory pages to the physical memory of the node (CPU package) on which the requesting thread is running, thus ensuring that the memory is local to the thread and access performance is optimal.
3. Strategies for NUMA Optimization
Two key notions in managing performance within the NUMA shared memory architecture are processor affinity and data placement.
3.1. Processor Affinity
1. The Basics of NUMA
NUMA, or Non-Uniform Memory Access, is a shared memory architecture in which the placement of main memory modules relative to the processors in a multiprocessor system determines access latency. Perhaps the best way to understand NUMA is to compare it with its cousin UMA, or Uniform Memory Access.
In the UMA memory architecture, all processors access shared memory through a bus (or another type of interconnect) as seen in the following diagram: