Intel® Microarchitecture Codename Nehalem Performance Monitoring Unit Programming Guide - Uncore

By Thomas M Johnson, Published: 03/15/2011, Last Updated: 03/15/2011

Download Article


This document contains advance information. While every effort has been made to ensure the accuracy of the information contained herein, some errors may occur. Please contact if you have questions or comments.

This document describes the programming interface to the performance monitoring hardware on the Nehalem processor core and the Nehalem-EP (Gainstown) uncore. This document does not exhaustively describe all of the performance monitoring events which may be counted in the Nehalem core or Nehalem-EP (Gainstown) uncore. A detailed description of these events may be released separately.

About this document

This is a programmer's reference manual for the Nehalem core and Nehalem-EP performance monitoring units (PMU). This is targeted for current tool owners requiring documentation updates for Nehalem based platforms. It is not intended for first time tool developers or as a user analysis guide. Additional documents will be available at a later date targeted at providing that information.

Nehalem-based PMU Architecture

Intel processor cores for many years included a Performance Monitoring Unit (PMU). This unit provided the ability to count the occurrence of micro-architectural events which expose some of the inner workings of the processor core as it executes code.

One usage of this capability is to create a list of events from which certain performance metrics can be calculated. Software configures the PMU to count events over an interval of time and report the resulting event counts. Using this methodology, performance analysts can characterize overall system performance.

The PMU also provides facilities to generate a hardware interrupt through the Local APIC integrated within the processor core or logical thread. In this case software can pre-load event counter registers with a "sample after value," in which case a hardware interrupt is generated after the occurrence of N number of events. In the interrupt handler software collects additional architectural state which provides analysts with information regarding the performance of specific areas of application code. This methodology is sometimes referred to as profiling the execution of an application.

Products based on the Nehalem processor core include the capability to collect event data under both of these scenarios. In addition, these products include various platform features (uncore) integrated on the same die as the processor core.

The uncore is essentially everything on the processor chip that is not part of the core. This includes point-to-point interconnect logic, memory controllers, and last-level caches, among other things. The uncores also provide an additional PMU facility that has the ability to interrupt the processor core in order that profiling information may be collected. This document describes the Nehalem processor core PMU, and the Nehalem-EP uncore PMU.

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804