Optimizing Software Applications for NUMA: Part 7 (of 7)


NUMA, or Non-Uniform Memory Access, is a shared memory architecture defined by the placement of main memory modules relative to the processors in a multiprocessor system. As a hierarchical shared memory scheme, its advantage is the potential to improve average-case access time by introducing fast, local memory. Realizing that potential, however, requires careful data placement. The more data that is placed in memory local to the processor that uses it, the more overall access time benefits from the architecture.
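The relationship between data placement and average access time can be illustrated with a simple weighted average. The sketch below is a simplified model, not a measurement: the function name and the two latency figures are hypothetical values chosen only for illustration, and real systems add effects such as caching and interconnect contention that the model ignores.

```python
def average_access_ns(local_fraction, local_ns, remote_ns):
    """Expected latency of one memory access when `local_fraction`
    of all accesses are served from the processor's local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies, chosen only for illustration.
LOCAL_NS = 100.0
REMOTE_NS = 150.0

for fraction in (0.5, 0.75, 1.0):
    print(fraction, average_access_ns(fraction, LOCAL_NS, REMOTE_NS))
```

Under these assumed latencies, raising the share of local accesses from one half to three quarters lowers the modeled average from 125 ns to 112.5 ns, and fully local placement reaches the 100 ns local latency.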

We have described various strategies and considerations for ensuring optimal data placement within a NUMA-based system. In particular, we have discussed the role of processor affinity, memory allocation strategies that rely on the operating system's implicit page placement policies, and the use of system APIs to assign and migrate memory pages explicitly.
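As a rough illustration of how the first two strategies combine, the sketch below pins the process to one CPU and only then allocates and writes to a buffer. It assumes a Linux system whose implicit placement policy puts a page on the node of the CPU that first touches it. The helper names and the 4096-byte page size are assumptions made for the example; `os.sched_getaffinity` and `os.sched_setaffinity` are standard library calls that exist only on some platforms, so the code falls back to doing nothing where they are missing.

```python
import os

PAGE_SIZE = 4096  # assumed page size, for illustration only


def pin_to_one_cpu():
    """Restrict the calling process to a single CPU it may already run on.

    Returns the chosen CPU number, or None on platforms that do not
    provide the affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    cpu = min(os.sched_getaffinity(0))  # lowest-numbered allowed CPU
    os.sched_setaffinity(0, {cpu})      # 0 means the calling process
    return cpu


def allocate_and_touch(size_bytes):
    """Allocate a buffer and write one byte per assumed page, so the
    pages are first touched by the CPU this process is running on."""
    buf = bytearray(size_bytes)
    for offset in range(0, size_bytes, PAGE_SIZE):
        buf[offset] = 1
    return buf


cpu = pin_to_one_cpu()
data = allocate_and_touch(1 << 20)  # 1 MiB working buffer
```

The ordering is the point of the sketch: the affinity is set before the buffer is written, so the thread that initializes the data is the one that later uses it. Whether the pages actually end up on the local node depends on the operating system's policy and on how the runtime obtains memory, which this example does not verify.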
