Performance Improvement Opportunities with NUMA Hardware

Intel Corporation’s non-uniform memory access (NUMA) strategy is based on several new memory technologies that promise significant improvements in both capability and performance:

  • Multi-Channel DRAM (MCDRAM) and High-Bandwidth Memory (HBM)
  • Non-volatile dual in-line memory modules (NVDIMMs)
  • Intel® Omni-Path Fabric (Intel® OP Fabric)

Software that works with the memory system’s performance characteristics has often been more efficient than software that disregards them. Software written for these new memory technologies will be no exception: using the memory subsystem efficiently may yield a performance improvement of 10x or more. Software modified to take advantage of non-volatile memory, which survives power-off and many system failures, may also see significant speedups.
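To make the point about working with the memory system concrete, here is a minimal sketch that counts how many cache lines a toy direct-mapped cache has to fetch when the same array is walked in two different orders. The cache geometry (64-byte lines, 64 slots), the array size, and every function name are assumptions chosen for illustration; they do not describe any particular Intel processor, and real caches are considerably more sophisticated.

```python
def count_line_fetches(addresses, line_bytes=64, cache_lines=64):
    """Count misses in a toy direct-mapped cache for a stream of byte addresses."""
    slots = [None] * cache_lines      # which memory line each cache slot holds
    misses = 0
    for addr in addresses:
        line = addr // line_bytes     # memory line containing this byte
        slot = line % cache_lines     # direct-mapped: one possible slot per line
        if slots[slot] != line:
            slots[slot] = line        # fetch the line, evicting the old one
            misses += 1
    return misses


def row_order(rows, cols, elem_bytes=8):
    """Addresses of a row-major array visited row by row (sequential in memory)."""
    return ((r * cols + c) * elem_bytes for r in range(rows) for c in range(cols))


def column_order(rows, cols, elem_bytes=8):
    """Addresses of the same array visited column by column (strided in memory)."""
    return ((r * cols + c) * elem_bytes for c in range(cols) for r in range(rows))


# Hypothetical workload: a 256 x 256 array of 8-byte elements.
ROWS = COLS = 256
row_misses = count_line_fetches(row_order(ROWS, COLS))
col_misses = count_line_fetches(column_order(ROWS, COLS))
print("row order fetches:   ", row_misses)
print("column order fetches:", col_misses)
```

In this toy model the row-order walk fetches each line exactly once, while the column-order walk fetches a line on every access, so it moves eight times as much data for the same computation. The actual ratio on real hardware depends on the cache hierarchy and the workload.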

These new technologies are already influencing the design of software and hardware, ranging from one-person servers to the largest systems, such as Aurora.

MCDRAM and HBM

MCDRAM is proprietary, high-bandwidth memory that physically sits atop Intel® Xeon Phi™ processors code named Knights Landing.

HBM, which is compatible with JEDEC standards, is high-bandwidth memory designed for Intel Xeon Phi processors code named Knights Hill.

NOTE: Some people refer to both MCDRAM and HBM as high-bandwidth memory.

Non-volatile DIMMs

There are several technologies for building dual in-line memory modules (DIMMs) whose contents survive power failures. Such NVDIMMs differ from double data rate (DDR) DIMMs in capacity, power consumption, and read-write performance. Memory that survives system power-off or failure enables you to persist data while accessing that persistent data at memory-subsystem speeds.
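The programming style that persistent memory encourages can be sketched with an ordinary memory-mapped file: the program updates data with in-memory reads and writes, and the data is still there the next time the region is mapped. This is only an approximation under stated assumptions. A regular file stands in for an NVDIMM, so the speeds are those of the file system rather than the memory subsystem, and the names `open_region`, `bump_counter`, and `counter.bin` are hypothetical.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Q")  # one unsigned 64-bit counter at offset 0


def open_region(path, size=mmap.PAGESIZE):
    """Map a file into memory, creating and zero-filling it on first use."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)    # new bytes read back as zero
        return mmap.mmap(fd, size)    # shared, read-write mapping by default
    finally:
        os.close(fd)                  # the mapping keeps its own handle


def bump_counter(path):
    """Increment the counter stored in the mapped region and return its new value."""
    region = open_region(path)
    try:
        (count,) = RECORD.unpack_from(region, 0)   # plain memory read
        RECORD.pack_into(region, 0, count + 1)     # plain memory write
        region.flush()                             # push the update to the backing store
        return count + 1
    finally:
        region.close()


# Each call maps the region afresh, as a restarted program would.
path = os.path.join(tempfile.mkdtemp(), "counter.bin")
first = bump_counter(path)
second = bump_counter(path)
print(first, second)
```

The second call sees the value written by the first even though the mapping was closed in between, which is the property that matters for persistent data. Code written for real non-volatile memory also has to consider the order in which updates become durable, which this sketch does not address.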

A future possibility is DIMMs built using 3D XPoint™ technology.

Intel OP Fabric

Unlike MCDRAM, HBM, and NVDIMM technologies, Intel OP Fabric technology connects processors to memory devices, whether those devices are beside the processors or many miles away. It simplifies the creation of parallel applications with an address space spread across many systems.
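One way to picture an address space spread across many systems is a mapping from a global element index to the system that holds it and the position within that system's local memory. The sketch below assumes a simple block distribution, with `Location`, `locate`, and the sizes all hypothetical. It illustrates the general idea only and is not a description of how Intel OP Fabric addresses memory.

```python
from typing import NamedTuple


class Location(NamedTuple):
    node: int     # which system holds the element
    offset: int   # position within that system's local memory


def locate(global_index: int, elems_per_node: int) -> Location:
    """Map a global index to (node, offset) under a block distribution."""
    if global_index < 0 or elems_per_node <= 0:
        raise ValueError("index must be non-negative and block size positive")
    node, offset = divmod(global_index, elems_per_node)
    return Location(node, offset)


# Hypothetical layout: each system holds 1000 consecutive elements.
here = locate(42, 1000)
there = locate(2500, 1000)
print(here, there)
```

An application written against the global index does not need to know which system holds a given element; the cost of reaching it, however, still depends on where it lives.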

Summary

These new hardware technologies will enable some existing applications to run faster without changes; however, the potential improvements are even greater if you are willing to change your software. The next article, NUMA Hardware Target Audience, helps you understand how these technologies apply to you.

About the Author

Bevin Brett is a Principal Engineer at Intel Corporation, working on tools to help programmers and system users improve application performance. During his free time, he refurbishes mechanical calculators from the 1920s or responds to medical emergencies as a 911 paramedic.

Resources

For more complete information about compiler optimizations, see our Optimization Notice.