CPI Rate
Metric Description
Cycles per Instruction Retired, or CPI, is a
fundamental performance metric indicating approximately how much time each
executed instruction took, in units of cycles. Modern superscalar processors
issue up to four instructions per cycle, suggesting a theoretical best CPI of
0.25. But various effects (long-latency memory, floating-point, or SIMD
operations; non-retired instructions due to branch mispredictions; instruction
starvation in the front-end) tend to pull the observed CPI up. A CPI of 1 is
generally considered acceptable for HPC applications, but different application
domains will have very different expected values. Nonetheless, CPI is an
excellent metric for judging the overall potential for application performance
tuning.
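As a minimal sketch of the arithmetic described above, CPI is the ratio of elapsed cycles to retired instructions, and the theoretical best CPI is the reciprocal of the issue width. The function names and sample counter values below are hypothetical and chosen only for illustration; they are not part of any profiler API.

```python
def cpi(cycles: int, instructions_retired: int) -> float:
    """Return cycles per instruction retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return cycles / instructions_retired


def theoretical_best_cpi(issue_width: int) -> float:
    """Return the lowest CPI possible for a given issue width."""
    if issue_width <= 0:
        raise ValueError("issue_width must be positive")
    return 1.0 / issue_width


# Hypothetical counter values, for illustration only.
sample_cycles = 3_000_000
sample_instructions = 2_000_000

observed = cpi(sample_cycles, sample_instructions)  # 1.5
best = theoretical_best_cpi(4)                      # 0.25 for a four-wide core

print(f"observed CPI = {observed}, theoretical best = {best}")
```

In this made-up case the observed CPI of 1.5 is six times the theoretical best, which would suggest looking for stalls before assuming the code is near its limit.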
Possible Issues
The CPI may be too high. This could be caused by issues such as memory
stalls, instruction starvation, branch mispredictions, or long-latency
instructions. Explore the other hardware-related metrics to identify what is
causing the high CPI.