This document contains advance information. While every effort has been made to ensure the accuracy of the information contained herein, some errors may occur. Please contact firstname.lastname@example.org if you have questions or comments.
This document describes the programming interface to the performance monitoring hardware on the Nehalem processor core and the Nehalem-EP (Gainestown) uncore. It does not exhaustively describe all of the performance monitoring events that may be counted in the Nehalem core or the Nehalem-EP (Gainestown) uncore. A detailed description of these events may be released separately.
About this document
This is a programmer's reference manual for the Nehalem core and Nehalem-EP performance monitoring units (PMUs). It is aimed at current tool owners who need documentation updates for Nehalem-based platforms. It is not intended for first-time tool developers or as a user analysis guide. Additional documents providing that information will be made available at a later date.
Nehalem-based PMU Architecture
Intel processor cores have for many years included a Performance Monitoring Unit (PMU). This unit provides the ability to count occurrences of micro-architectural events, which expose some of the inner workings of the processor core as it executes code.
One usage of this capability is to create a list of events from which certain performance metrics can be calculated. Software configures the PMU to count events over an interval of time and report the resulting event counts. Using this methodology, performance analysts can characterize overall system performance.
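The counting methodology above can be sketched in a few lines of C. This is a minimal illustration, not the interface defined by this manual: the helper names counter_delta and event_ratio are hypothetical, the counter width is passed in as a parameter rather than taken from any particular processor, and the raw counter values are assumed to have been read by the tool at the start and end of the interval.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: events counted between two raw reads of a
 * free-running counter that is width_bits wide. Tolerates at most one
 * wraparound of the counter during the interval. */
static uint64_t counter_delta(uint64_t start, uint64_t end, unsigned width_bits)
{
    assert(width_bits > 0 && width_bits < 64);
    uint64_t mask = (UINT64_C(1) << width_bits) - 1;
    return (end - start) & mask;
}

/* Hypothetical helper: a ratio metric built from two event counts,
 * for example cycles per instruction. Returns 0.0 when the
 * denominator count is zero. */
static double event_ratio(uint64_t numerator, uint64_t denominator)
{
    return denominator ? (double)numerator / (double)denominator : 0.0;
}
```

A tool would typically compute one delta per configured event over the same interval and then combine the deltas into the metrics it reports.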
The PMU also provides facilities to generate a hardware interrupt through the Local APIC integrated within the processor core or logical thread. In this mode, software pre-loads an event counter register with a "sample after value" so that a hardware interrupt is generated after N occurrences of the event. In the interrupt handler, software collects additional architectural state, which gives analysts information about the performance of specific areas of application code. This methodology is sometimes referred to as profiling the execution of an application.
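As an illustration of the pre-load step, the sketch below computes the value a tool might write into a counter so that it reaches its limit after N further events. It assumes the common scheme in which the counter counts upward and the interrupt is raised when it overflows; the function name sav_preload is hypothetical, and the counter width is a parameter that must come from the processor's own documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pre-load value for an up-counting counter of
 * width_bits bits so that it overflows after sample_after more events.
 * Assumes the interrupt is raised on counter overflow. */
static uint64_t sav_preload(uint64_t sample_after, unsigned width_bits)
{
    assert(width_bits > 0 && width_bits < 64);
    uint64_t mask = (UINT64_C(1) << width_bits) - 1;
    assert(sample_after > 0 && sample_after <= mask);
    /* Two's-complement negation, truncated to the counter width. */
    return (0 - sample_after) & mask;
}
```

Under these assumptions, the interrupt handler would write the same pre-load value back into the counter before returning so that the next sample is taken after another N events.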
Products based on the Nehalem processor core include the capability to collect event data under both of these scenarios. In addition, these products include various platform features (uncore) integrated on the same die as the processor core.