Optimizing Software Applications for NUMA: Part 3 (of 7)

2. NUMA Advantages and Risks

The advantage of the NUMA shared memory architecture is its potential to reduce memory access time in the average case. Because each node has its own local memory, memory accesses can proceed in parallel, avoiding the throughput limitations and contention associated with a single shared memory bus. In fact, memory-constrained systems can theoretically improve their performance by up to a factor equal to the number of nodes on the system. For example, a memory-constrained dual-processor system could conceivably double its performance if the two processors could access memory in a fully parallel manner.
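
As a small illustration, the sketch below simply asks the platform how many NUMA nodes it exposes. It assumes a Linux system with the libnuma library (neither is specified in this article) and is meant only to show where that information comes from; compile with -lnuma.

    /* Sketch (assumes Linux + libnuma): report how many NUMA nodes the
     * kernel has configured.  Build with:  gcc detect_nodes.c -lnuma   */
    #include <numa.h>
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0) {
            printf("NUMA is not available on this system.\n");
            return 1;
        }
        /* Number of memory nodes configured by the kernel. */
        int nodes = numa_num_configured_nodes();
        printf("System exposes %d NUMA node(s).\n", nodes);
        return 0;
    }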

The downside of the NUMA architecture, however, is the cost incurred when data is not local to the processor that needs it. In the NUMA model, the time required to retrieve data from an adjacent node is significantly higher than that required to access local memory, and the time required to retrieve data from a non-adjacent node may be higher still, creating a hierarchy of possible access times. In general, as the distance between a processor and the memory it accesses increases, so does the cost of the access.(2)
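
This access-time hierarchy is something the platform itself reports. The sketch below, again assuming Linux and libnuma (an assumption, not something the article specifies), prints the node-to-node distance table; the values are the relative costs the firmware advertises, where 10 means local and larger numbers mean more distant nodes.

    /* Sketch (assumes Linux + libnuma): print the node-to-node distance
     * table.  A value of 10 is local; larger values are farther away.
     * Build with -lnuma.                                               */
    #include <numa.h>
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0)
            return 1;

        int max = numa_max_node();

        printf("node ");
        for (int j = 0; j <= max; j++)
            printf("%5d", j);
        printf("\n");

        for (int i = 0; i <= max; i++) {
            printf("%4d ", i);
            for (int j = 0; j <= max; j++)
                printf("%5d", numa_distance(i, j));  /* relative access cost */
            printf("\n");
        }
        return 0;
    }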

The key issue in determining whether the performance benefits of the NUMA architecture can be realized, then, is data placement. The more effectively data can be placed in memory local to the processor that needs it, the more overall access time will benefit from the architecture. Conversely, the more data ends up remote from the node that accesses it, the more memory performance will suffer. For this reason, the NUMA architecture can be said to provide only the potential to reduce overall memory access times; to realize that potential, strategies are needed to ensure smart data placement. An application that effectively manages such placement is said to be “optimized for NUMA”, “NUMA-aware”, or “NUMA-friendly”.
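
One common way to manage placement explicitly is to bind a worker thread to a node and allocate its working set from that node's local memory. The sketch below shows the idea using libnuma on Linux (assumed here; the article itself does not prescribe a mechanism). The node number and buffer size are illustrative only; a real application would choose them per worker thread.

    /* Sketch (assumes Linux + libnuma): run on a chosen node and allocate
     * the working buffer from that node's memory so later accesses stay
     * local.  Build with -lnuma.                                         */
    #include <numa.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        if (numa_available() < 0)
            return 1;

        int node = 0;                      /* illustrative choice of node  */
        numa_run_on_node(node);            /* restrict execution to node 0 */

        size_t size = 64 * 1024 * 1024;    /* illustrative buffer size     */
        char *buf = numa_alloc_onnode(size, node);  /* memory on node 0    */
        if (buf == NULL)
            return 1;

        memset(buf, 0, size);              /* touch pages; they are backed
                                              by node 0's local memory     */

        /* ... process buf on this node, keeping accesses local ... */

        numa_free(buf, size);
        return 0;
    }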

References:
(2) Intel® 64 and IA-32 Architectures Optimization Reference Manual, March 2009. See Section 8.8, “Affinities and Managing Shared Platform Resources”.