Metric Description

This metric represents the fraction of cycles during which an application could be stalled because it is approaching the bandwidth limit of main memory (DRAM). It does not aggregate requests from other threads, cores, or sockets (see Uncore counters for that). On NUMA multi-socket systems, consider improving data locality.

Possible Issues

A significant fraction of cycles were stalled due to approaching bandwidth limits of the main memory (DRAM).


Improve data accesses to reduce cacheline transfers from/to memory using these possible techniques:

  • Consume all bytes of each cacheline before it is evicted (for example, reorder structure elements and split rarely used, non-hot ones into a separate structure).

  • Merge compute-limited and bandwidth-limited loops.

  • Use NUMA optimizations on a multi-socket system.
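
The first two techniques above can be sketched in C. This is a minimal illustration, not a tuned implementation: all type and function names (record_mixed, record_cold, sum_mixed, sum_hot, poly, separate_loops, merged_loop) are hypothetical, and the layout comments assume a 64-byte cacheline and an 8-byte double. Whether merging loops pays off depends on the workload and should be confirmed by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Mixed layout: 8 hot bytes share each (assumed 64-byte) cacheline
 * with cold data, so most of every transferred line goes unused. */
struct record_mixed {
    double value;      /* hot: read on every iteration */
    char   label[48];  /* cold: rarely read */
    long   id;         /* cold: rarely read */
};

/* Split layout: cold fields move to a parallel array, and the hot
 * values are stored separately in a dense array of double. */
struct record_cold {
    char label[48];
    long id;
};

/* Touches one record per element but uses only the 8-byte hot field. */
double sum_mixed(const struct record_mixed *r, size_t n)
{
    assert(r != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += r[i].value;
    return s;
}

/* Same result from the dense hot array: every byte of each
 * transferred cacheline is consumed. */
double sum_hot(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += values[i];
    return s;
}

/* Stand-in for arithmetic-heavy work: x^3 + x^2 + x + 1 (Horner form). */
static double poly(double x)
{
    return ((x + 1.0) * x + 1.0) * x + 1.0;
}

/* Separate loops: the first is bandwidth-limited (streams a, b, y),
 * the second is compute-limited (mostly arithmetic on c). */
void separate_loops(const double *a, const double *b, const double *c,
                    double *y, double *z, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a[i] + b[i];
    for (size_t i = 0; i < n; i++)
        z[i] = poly(c[i]);
}

/* Merged loop: the same work in a single pass, so arithmetic can
 * overlap with outstanding memory traffic. */
void merged_loop(const double *a, const double *b, const double *c,
                 double *y, double *z, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a[i] + b[i];
        z[i] = poly(c[i]);
    }
}
```

Under the stated layout assumptions, sum_mixed pulls in roughly one cacheline per element while sum_hot pulls in one per eight elements, and both return the same sum. separate_loops and merged_loop also produce identical y and z arrays; only the order of memory accesses differs.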


Software prefetches do not help a bandwidth-limited application.

