Performance Impact When Sampling Certain LLC Events on SNB-EP with Intel VTune Amplifier XE Performance Profiler

Sampling any of the following events with the Intel® VTune™ Amplifier XE performance profiler on an Intel® Xeon® E5-2600 based system has a performance impact. While one or more of these events is being sampled, the latency of data loads (from your application or from other programs running on the same system) increases by up to 10 ns for loads served from memory and by up to 2 ns for loads served from the last-level cache (LLC). The impact is a consequence of the way these events are sampled, and it is present for as long as one or more of them is being sampled. Be aware that your overall performance data will be affected: the counts of the events listed below remain accurate, but CPI, CPU_CLK_UNHALTED.THREAD, and other latency-related metrics may increase in any data collection that includes one of these events.

  • MEM_LOAD_UOPS_RETIRED.LLC_HIT
  • MEM_LOAD_UOPS_RETIRED.LLC_MISS
  • MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS
  • MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT
  • MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM
  • MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE
  • MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM
  • MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM
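
As a quick check before configuring a custom collection, the list above can be captured in a small script that flags which of your selected events fall into the impacted set. This is only an illustrative sketch: the helper name `impacted_events` and the sample selection are hypothetical, and the script does not call any VTune Amplifier XE API.

```python
# Events named in this article as increasing load latency while they are sampled.
IMPACTED_EVENTS = frozenset({
    "MEM_LOAD_UOPS_RETIRED.LLC_HIT",
    "MEM_LOAD_UOPS_RETIRED.LLC_MISS",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE",
    "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM",
    "MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM",
})


def impacted_events(selected):
    """Return the selected event names that are in the article's list, in input order."""
    return [name for name in selected if name.strip().upper() in IMPACTED_EVENTS]


# Hypothetical event selection for a custom collection.
selected = [
    "CPU_CLK_UNHALTED.THREAD",
    "INST_RETIRED.ANY",
    "MEM_LOAD_UOPS_RETIRED.LLC_MISS",
]

flagged = impacted_events(selected)
if flagged:
    print("Latency-related metrics may be inflated; impacted events:", flagged)
else:
    print("No impacted events selected.")
```

If the returned list is non-empty, treat CPI and clocktick figures from that collection with the caution described above.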

Several of these events are part of the "General Exploration" profile in VTune Amplifier XE for Xeon E5-2600 family processors. The profile, including the impacted events, can still be used to characterize application performance and identify potential issues. If more accurate timing information (in terms of CPU_CLK_UNHALTED) is needed, use the results of the "Lightweight Hotspots" analysis type in conjunction with the General Exploration profile.
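
To get a feel for what the added latency means in terms of clockticks, the per-load upper bounds quoted in this article (10 ns from memory, 2 ns from the last-level cache) can be converted to core cycles. The sketch below is plain arithmetic; the 2.5 GHz core frequency is an assumed, hypothetical value, so substitute the actual frequency of your system. It gives a per-load ceiling only and says nothing about how much of that latency overlaps with other work.

```python
# Upper bounds on added load latency stated in this article.
MAX_EXTRA_MEMORY_NS = 10.0  # loads served from memory
MAX_EXTRA_LLC_NS = 2.0      # loads served from the last-level cache


def ns_to_cycles(nanoseconds, core_ghz):
    """Convert a duration in nanoseconds to core clock cycles at core_ghz GHz."""
    # 1 GHz is one cycle per nanosecond, so cycles = ns * GHz.
    return nanoseconds * core_ghz


# Assumed core frequency (hypothetical); replace with your system's value.
core_ghz = 2.5

print(ns_to_cycles(MAX_EXTRA_MEMORY_NS, core_ghz))  # 25.0 cycles per memory load, at most
print(ns_to_cycles(MAX_EXTRA_LLC_NS, core_ghz))     # 5.0 cycles per LLC load, at most
```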

Note: the performance impact is not present on single-socket systems based on Intel® microarchitecture code name Sandy Bridge.