Optimizing Software Applications for NUMA: Part 2 (of 7)

Modern Processors

Modern multiprocessor systems mix the two basic architectures, UMA and NUMA, as seen in the following diagram:

In this complex hierarchical scheme, processors are grouped by the multi-core CPU package, or “node”, on which they physically reside. Processors within a node share access to that node's memory modules, as in the UMA shared memory architecture. They can also access memory attached to a remote node over a shared interconnect, but with slower performance, as in the NUMA shared memory architecture.
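The grouping of processors into nodes can be inspected on a running system. The following is a minimal sketch, not part of the original article, which assumes a Linux host that exposes NUMA information under /sys/devices/system/node (each nodeN directory holding a cpulist file). The helper names parse_cpulist and numa_topology are illustrative, and on systems without this sysfs layout the sketch simply reports nothing.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list (for example a memory-only node) yields no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def numa_topology(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]}; an empty dict if no node information is found.

    Assumption: sysfs_root follows the Linux layout nodeN/cpulist.
    """
    topology = {}
    for node_dir in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        match = re.search(r"node(\d+)$", node_dir)
        cpulist_path = os.path.join(node_dir, "cpulist")
        if match is None or not os.path.isfile(cpulist_path):
            continue
        with open(cpulist_path) as f:
            topology[int(match.group(1))] = parse_cpulist(f.read())
    return topology


if __name__ == "__main__":
    # Print which processors belong to which node on this machine.
    for node, cpus in sorted(numa_topology().items()):
        print(f"node {node}: CPUs {cpus}")
```

Processors listed under the same node share that node's local memory; memory belonging to any other node is remote to them and is reached over the interconnect.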

Server platforms such as Intel® Xeon that use Intel® Core i7 processors are an example of this complex memory architecture, so the rest of our discussion centers on them. Such platforms employ a fast interconnect technology known as Intel® QuickPath Interconnect (QPI) to mitigate (but not eliminate) the slower performance of remote memory.