Describes a process for measuring memory bandwidth for the Intel® Core™ i7 Processor Family.
memory bandwidth
Dissecting STREAM benchmark with Intel® Performance Counter Monitor
Intel® Performance Counter Monitor (Intel® PCM) is an API and a set of tools that help developers understand how their applications utilize the underlying compute platform. In this blog post I explain how to instrument the well-known STREAM benchmark with Intel® PCM library functions that read statistics directly from the integrated memory controllers available on the latest Intel® Xeon® 5500, 5600, 7500 and Core™ processor series.
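To make the idea of instrumenting STREAM concrete, here is a minimal sketch of a STREAM-style Triad kernel with the timing that a counter-based measurement would wrap around. It does not call Intel® PCM: the comments only mark where counter snapshots would be taken. The names `TriadResult` and `run_triad` are hypothetical, chosen for this illustration, and the byte count follows STREAM's usual accounting convention (two reads and one write per element) rather than anything measured by hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one timed kernel run (hypothetical helper type).
struct TriadResult {
    double seconds;      // wall-clock time spent in the kernel
    double bytes_moved;  // bytes counted by STREAM's convention for Triad
    double gb_per_sec;   // bytes_moved / seconds / 1e9
};

// STREAM-style Triad kernel: a[i] = b[i] + q * c[i].
// Bandwidth here is estimated from the array sizes, not read from hardware.
TriadResult run_triad(std::vector<double>& a, const std::vector<double>& b,
                      const std::vector<double>& c, double q) {
    const std::size_t n = a.size();
    assert(b.size() == n && c.size() == n);

    // A "before" snapshot of the memory controller counters would go here.
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = b[i] + q * c[i];
    }
    const auto t1 = std::chrono::steady_clock::now();
    // An "after" snapshot would go here; the difference between the two
    // snapshots gives the bytes actually transferred by the controllers.

    TriadResult r;
    r.seconds = std::chrono::duration<double>(t1 - t0).count();
    // Two arrays are read and one is written: 3 * sizeof(double) per element.
    r.bytes_moved = 3.0 * sizeof(double) * static_cast<double>(n);
    r.gb_per_sec = r.seconds > 0.0 ? r.bytes_moved / r.seconds / 1e9 : 0.0;
    return r;
}
```

For a meaningful figure the arrays should be much larger than the last-level cache. Comparing the size-based estimate with what the memory controllers report is the point of instrumenting the benchmark.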
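Measuring memory access bandwidth on the Xeon® 5500 Series platform with Intel performance tuning tools
The Intel® Xeon® 5500 processor, based on the new-generation Nehalem architecture, replaces the traditional FSB (front-side bus) design with a NUMA (non-uniform memory access) architecture to increase memory access bandwidth.
I previously wrote an article on how to use the Intel® VTune™ Performance Analyzer performance counter MEM_UNCORE_RETIRED.REMOTE_DRAM on the Nehalem architecture to measure the extra overhead that multithreaded applications incur while waiting for the IMC (integrated memory controller) to interact with the I/O Hub. For details, see /zh-cn/blogs/2009/12/25/intel-core-i7-processor-numa
Sometimes, however, what we care about is how close memory bandwidth comes to saturation over a period of the program's run, which can have a large impact on program performance.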
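As a rough illustration of what "bandwidth saturation over a period" means, the sketch below turns byte counts observed during an interval into an achieved bandwidth and a utilization ratio. Everything here is an assumption for illustration: `BandwidthSample` and `bandwidth_over_interval` are hypothetical names, the byte counts are assumed to come from whatever counters the platform exposes, and the peak bandwidth is a value the reader supplies for their own memory configuration.

```cpp
#include <cstdint>

// Achieved bandwidth and its ratio to the platform peak (hypothetical type).
struct BandwidthSample {
    double gb_per_sec;   // (bytes read + bytes written) / seconds / 1e9
    double utilization;  // gb_per_sec / peak_gb_per_sec; 1.0 means saturated
};

// Converts byte counts observed over an interval into bandwidth figures.
// bytes_read / bytes_written: counter deltas over the interval (assumed inputs).
// peak_gb_per_sec: theoretical peak for the memory configuration (assumed input).
BandwidthSample bandwidth_over_interval(std::uint64_t bytes_read,
                                        std::uint64_t bytes_written,
                                        double seconds,
                                        double peak_gb_per_sec) {
    BandwidthSample s{0.0, 0.0};
    if (seconds <= 0.0) {
        return s;  // no elapsed time, nothing meaningful to report
    }
    const double total_bytes =
        static_cast<double>(bytes_read) + static_cast<double>(bytes_written);
    s.gb_per_sec = total_bytes / seconds / 1e9;
    if (peak_gb_per_sec > 0.0) {
        s.utilization = s.gb_per_sec / peak_gb_per_sec;
    }
    return s;
}
```

A utilization close to 1.0 over a sustained interval suggests the program is limited by memory bandwidth rather than by computation.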
On the Core(TM)2 Duo architecture, memory bandwidth can be obtained with the following formula:
