Intel PCM Column Names Decoder Ring

When Intel Performance Counter Monitor (Intel PCM) writes CSV output, it uses short names as column headers. This keeps the table width manageable when the data is loaded into a spreadsheet program, but it makes it hard to guess what the abbreviations mean. Since I get a lot of questions about how to interpret these column names, I've put together a decoder ring:

Field | Explanation | Example

The following metrics are available on all levels:
Date | Day-Month-Year | 05-02-14
Time | Time of day | 13:38:04
EXEC | Instructions per nominal CPU cycle, i.e. relative to the nominal CPU frequency, ignoring turbo and power saving | 0.182
IPC | Instructions per cycle. This measures how effectively you are using the core. | 0.159
FREQ | Frequency relative to nominal CPU frequency ("clockticks"/"invariant timer ticks") | 1.143
AFREQ | Frequency relative to nominal CPU frequency, excluding the time when the CPU is sleeping | 1.143
L3MISS | L3 cache line misses in millions | 182.879
L2MISS | L2 cache line misses in millions | 356.3
L3HIT | L3 cache hit ratio (hits/references) | 0.487
L2HIT | L2 cache hit ratio (hits/references) | 0.233
L3CLK | Very rough estimate of cycles lost to L3 cache misses vs. clockticks | 0.044
L2CLK | Very rough estimate of cycles lost to L2 cache misses vs. clockticks | 0.008

The following metrics are only available on socket and system level:
READ | Memory read traffic on this socket in GB | 23.108
WRITE | Memory write traffic on this socket in GB | 10.782

The following metrics are only available on a socket level:
Proc Energy (Joules) | The energy consumed by the processor in Joules. Divide by the time to get the power consumption in watts. | 122.457
DRAM Energy (Joules) | The energy consumed by the DRAM attached to this socket in Joules. Divide by the time to get the power consumption in watts. | 115.747
TEMP | Thermal headroom in Kelvin (max design temperature – current temperature) | 32

The following metrics are only available on a system level:
INST | Number of instructions retired | 119706
ACYC | Number of clockticks. This takes turbo and power saving modes into account. | 750640.8
TIME(ticks) | Number of invariant clockticks. This is invariant to turbo and power saving modes. | 2817.883
PhysIPC | Instructions per cycle (IPC) multiplied by number of threads per core. See section "Core Cycles-per-Instruction (CPI) and Thread CPI" in Performance Insights to Intel® Hyper-Threading Technology for some background information. | 0.319
PhysIPC% | Instructions per cycle (IPC) multiplied by number of threads per core, relative to maximum IPC | 7.974
INSTnom | Instructions per nominal cycle multiplied by number of threads per core | 0.365
INSTnom% | Instructions per nominal cycle multiplied by number of threads per core, relative to maximum IPC. The maximum IPC is 2 for Atom and 4 for all other supported processors. | 9.113
TotalQPIin | QPI data traffic estimation (data traffic coming to CPU/socket through QPI links) in MB (1 MB = 1024*1024 bytes) | 21937.96
QPItoMC | Ratio of QPI traffic to memory traffic | 0.632
TotalQPIout | QPI traffic estimation (data and non-data traffic outgoing from CPU/socket through QPI links) in MB (1 MB = 1024*1024 bytes) | 38443.3
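
Several of the columns above are related by simple arithmetic that follows from their definitions, which can help when sanity-checking a CSV file. The sketch below is only an illustration: the function names are made up for this example and are not part of PCM, and the value of 2 threads per core is an assumption (Hyper-Threading enabled) that happens to match the example values in the table.

```python
def exec_from_ipc(ipc: float, freq: float) -> float:
    """EXEC (instructions per nominal cycle) = IPC * FREQ.

    IPC is instructions / clockticks and FREQ is clockticks / invariant ticks,
    so their product is instructions / invariant ticks.
    """
    return ipc * freq


def phys_ipc(ipc: float, threads_per_core: int) -> float:
    """PhysIPC = IPC multiplied by the number of threads per core."""
    return ipc * threads_per_core


def percent_of_max(value: float, max_ipc: float = 4.0) -> float:
    """Express PhysIPC or INSTnom as a percentage of the maximum IPC.

    The maximum IPC is 2 for Atom and 4 for all other supported processors.
    """
    return 100.0 * value / max_ipc


# Example values taken from the table above.
ipc, freq = 0.159, 1.143
threads_per_core = 2  # assumption: Hyper-Threading enabled

exec_estimate = exec_from_ipc(ipc, freq)         # close to the EXEC example 0.182
phys_estimate = phys_ipc(ipc, threads_per_core)  # close to the PhysIPC example 0.319
phys_percent = percent_of_max(0.319)             # close to the PhysIPC% example 7.974

print(exec_estimate, phys_estimate, phys_percent)
```

Small differences from the table values are expected, since the examples in the table are rounded to three decimals.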

Please also note that PCM reports absolute values for the measured time interval. For example, with a time interval of 5 seconds, memory traffic and instructions retired are reported as totals for the whole 5 seconds. Only when PCM runs with a 1-second interval do the memory traffic values correspond directly to GB/s.
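
To compare runs taken with different intervals, the per-interval totals can be divided by the interval length. The following is a minimal sketch, not PCM functionality: `to_rates` is a hypothetical helper, and the 5-second interval is an assumed value chosen to match the example in the paragraph above (the table does not state which interval its example values were measured with).

```python
def to_rates(totals: dict, interval_s: float) -> dict:
    """Divide per-interval totals by the interval length in seconds.

    GB becomes GB/s, and Joules become watts.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return {name: value / interval_s for name, value in totals.items()}


# Totals as they might appear in one CSV row (values from the table above).
sample = {
    "READ": 23.108,                   # GB in the interval
    "WRITE": 10.782,                  # GB in the interval
    "Proc Energy (Joules)": 122.457,  # Joules in the interval
}

rates = to_rates(sample, interval_s=5.0)  # assumed 5-second interval
for name, value in rates.items():
    print(f"{name}: {value:.4f} per second")
```

Only apply this to columns that are totals; ratios such as IPC, FREQ, or L3HIT are already independent of the interval length.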

 


4 comments

Thomas Willhalm (Intel):

"MPI" stands for "misses per instruction". The previous clock estimate turned out to be too rough and misleading in several cases. We believe that MPI is a better metric.

ANDREA G. (Intel):

Has L3CLK been replaced by L3MPI? What is the meaning of the latter?

Thanks!

GA I.:

Why are I/O requests not logged in the CSV file?
NTCWS
