Memory and Cache Profiling Erratum on Intel® Xeon® processor E5 family

Audience: Anyone collecting event-based performance data on a platform based on the Intel® Xeon® processor E5 family.

There is a performance monitoring unit (PMU) erratum on the Intel® Xeon® processor E5 family that affects the events used for memory and cache profiling. To collect data on the events listed in Table 1 below, a workaround must be applied, whether sampling in regular or precise (PEBS) mode. The workaround corrects the data source encoding stored in the PEBS record for the Load Latency events and ensures correct counts for each event.

The workaround increases memory and L3 latencies, so it should be applied only while sampling one or more of these events. Counts for the events in Table 1 will be accurate, but CPI, CPU_CLK_UNHALTED.THREAD, and other latency-related metrics may increase in any data collection that includes one of these events.
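Because the workaround distorts latency-related metrics, one practical approach is to split a planned collection into two runs: one with the workaround for the affected events, and one without it for everything else. The sketch below illustrates that split. The helper names (needs_workaround, split_collections) are hypothetical and not part of any tool; the event names are taken from Table 1.

```python
# Events from Table 1 that require the workaround (exact names).
AFFECTED_EVENTS = {
    "MEM_LOAD_UOPS_RETIRED.LLC_HIT",
    "MEM_LOAD_UOPS_RETIRED.LLC_MISS",
    "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM",
    "MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE",
}

# The load latency event appears with a threshold suffix (LOAD_LATENCY_GT_*).
LOAD_LATENCY_PREFIX = "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_"


def needs_workaround(event):
    """Return True if the named event is one of the Table 1 events."""
    name = event.upper()
    return name in AFFECTED_EVENTS or name.startswith(LOAD_LATENCY_PREFIX)


def split_collections(events):
    """Split events into (run with workaround, run without workaround)."""
    with_workaround = [e for e in events if needs_workaround(e)]
    without_workaround = [e for e in events if not needs_workaround(e)]
    return with_workaround, without_workaround


if __name__ == "__main__":
    wanted = [
        "CPU_CLK_UNHALTED.THREAD",
        "MEM_LOAD_UOPS_RETIRED.LLC_MISS",
        "MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32",
    ]
    affected, unaffected = split_collections(wanted)
    print("Collect with workaround:   ", affected)
    print("Collect without workaround:", unaffected)
```

Collecting CPU_CLK_UNHALTED.THREAD in the second run keeps CPI and similar metrics free of the added latency.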

Note that this workaround is not necessary on platforms based on the 2nd Generation Intel® Core™ processor family.

Table 1: Memory Events Requiring the Workaround*

CODE | UMASK | NAME | Description
0xD1 | 0x04 | MEM_LOAD_UOPS_RETIRED.LLC_HIT | Retired load uops whose data source was an LLC hit with no snoop required.
0xD1 | 0x20 | MEM_LOAD_UOPS_RETIRED.LLC_MISS | Retired load uops whose data source was an LLC miss.
0xD3 | 0x01 | MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM | Retired load uops whose data source was local memory.
0xD3 | 0x04 | MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM | Retired load uops whose data source was remote DRAM.
0xD2 | 0x01 | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS | Retired load uops whose data source was an on-package core cache LLC hit and the cross-core snoop missed.
0xD2 | 0x02 | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT | Retired load uops whose data source was an on-package LLC hit and the cross-core snoop hit.
0xD2 | 0x04 | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM | Retired load uops whose data source was an on-package core cache with HitM responses.
0xD2 | 0x08 | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE | Retired load uops whose data source was an LLC hit with no snoop required.
0xCD | 0x01 | MEM_TRANS_RETIRED.LOAD_LATENCY_GT_* | Randomly sampled loads whose latency exceeds a user-defined threshold, collected via the PEBS record.

* Only base event information is given; all other restrictions and programming requirements documented in the Software Developer's Manual (SDM) apply accordingly.
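Tools that accept raw event encodings combine the CODE and UMASK columns of Table 1 into a single value, with the event code in bits 0-7 and the unit mask in bits 8-15. As a minimal sketch, the hypothetical helper below (raw_event is not part of any tool) builds the rNNNN form that Linux perf accepts for raw events. It covers only the base encoding; the load latency event in particular needs the additional programming described in the SDM.

```python
def raw_event(code, umask):
    """Build a perf-style raw event string from an event code and unit mask.

    The event select occupies bits 0-7 and the unit mask bits 8-15 of the
    event select register, so the raw value is (umask << 8) | code.
    """
    if not (0 <= code <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("code and umask must each fit in 8 bits")
    return "r{:04x}".format((umask << 8) | code)


if __name__ == "__main__":
    # (CODE, UMASK, NAME) rows copied from Table 1.
    table = [
        (0xD1, 0x04, "MEM_LOAD_UOPS_RETIRED.LLC_HIT"),
        (0xD1, 0x20, "MEM_LOAD_UOPS_RETIRED.LLC_MISS"),
        (0xD3, 0x01, "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM"),
        (0xD2, 0x04, "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM"),
    ]
    for code, umask, name in table:
        print(raw_event(code, umask), name)
```

For example, MEM_LOAD_UOPS_RETIRED.LLC_HIT (code 0xD1, umask 0x04) becomes r04d1.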

How to sample affected events:

Several performance analysis tools automatically apply the workaround to sample affected events and remove the workaround after data collection.

Performance Tools For Windows* users:

Performance Tools For Linux* users:

For Linux Perf users:

  • The Intel PMU tools provide Python scripts that enable and disable the workaround for use with Linux perf. See the latego.py script for more information.
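The apply, collect, remove pattern described above can be sketched as a context manager that guarantees the workaround is removed even if data collection fails. This is an illustration only: the enable and disable callables are hypothetical placeholders supplied by the caller, and the sketch does not reproduce the interface of latego.py or of any other tool.

```python
from contextlib import contextmanager


@contextmanager
def workaround_applied(enable, disable):
    """Apply the workaround for the duration of a 'with' block.

    'enable' and 'disable' are caller-supplied callables (hypothetical
    here) that turn the workaround on and off on the target system.
    """
    enable()
    try:
        yield
    finally:
        # Always remove the workaround so that later collections do not
        # see the increased memory and L3 latencies.
        disable()


if __name__ == "__main__":
    log = []
    with workaround_applied(lambda: log.append("workaround on"),
                            lambda: log.append("workaround off")):
        log.append("collect affected events")
    print(log)
```

Keeping the removal in a finally clause means a failed or interrupted collection does not leave the system running with the workaround still active.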