Intel® Performance Bottleneck Analyzer (Archived)

Back End 2 Uops Executed

TITLE: Back End 2 Uops Executed

ISSUE_NAME: Backend^CoreBound^Cycles2PortsUtilized

DESCRIPTION:

Cycles where the core executed a total of 2 uops

RELEVANCE:

This metric represents how often the core executed a total of 2 uops in a cycle. Depending on the instruction mix, executing only 2 uops in a cycle could indicate a bottleneck, since the maximum uop execution bandwidth was not achieved.

EXAMPLE:

Back End 1 Uop Executed

TITLE: Back End 1 Uop Executed

ISSUE_NAME: Backend^CoreBound^Cycles1PortsUtilized

DESCRIPTION:

Cycles where the core executed 1 uop

RELEVANCE:

This metric represents how often the core executed 1 uop in a cycle. Depending on the instruction mix, executing only 1 uop in a cycle may indicate a bottleneck, since the maximum uop execution bandwidth was not achieved.

EXAMPLE:

SOLUTION:

Back End 0 Uops Executed

TITLE: Back End 0 Uops Executed

ISSUE_NAME: Backend^CoreBound^Cycles0PortsUtilized

DESCRIPTION:

Cycles where the core executed no uops

RELEVANCE:

This metric represents how often the core executed no uops in a cycle. When no uops are executed, a stall somewhere in the pipeline is preventing execution.

EXAMPLE:

SOLUTION:

If this metric is high, look at the other Backend metrics to determine the root cause of why the core executed no uops.

RELATED_SOURCES:

NOTES:
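The three uops-executed metrics above (0, 1, and 2 uops per cycle) can be compared side by side as shares of total cycles. The following Python sketch assumes you have already collected a cycle count for each bucket. The function name, the parameter names, and the choice of total cycles as the denominator are illustrative assumptions, not the tool's documented formula, and the sample values are made up.

```python
def uops_executed_shares(cycles_0_uops, cycles_1_uop, cycles_2_uops, total_cycles):
    """Return the share of total cycles spent in each uops-executed bucket.

    All inputs are hypothetical raw cycle counts gathered by the user;
    the tool's actual counters and formulas are not shown in this document.
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return {
        "0_uops": cycles_0_uops / total_cycles,
        "1_uop": cycles_1_uop / total_cycles,
        "2_uops": cycles_2_uops / total_cycles,
    }


# Made-up counts, for illustration only.
shares = uops_executed_shares(
    cycles_0_uops=250, cycles_1_uop=125, cycles_2_uops=125, total_cycles=1000
)
```

A large share in the 0-uops bucket points toward the other Backend metrics, while large 1-uop or 2-uop shares are only a concern if the instruction mix could have sustained more uops per cycle.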

Back End Core Bound

TITLE: Back End Core Bound

ISSUE_NAME: Backend^CoreBound

DESCRIPTION:

Cycles the back end is bound on core non-memory issues (i.e. Out of Order (OOO) resources and execution units)

RELEVANCE:

This metric represents how often the pipeline was back end bound on core non-memory issues.  This may indicate that you have run out of OOO resources or are saturating certain execution units (e.g. the use of FP-chained long-latency arithmetic operations) which can limit performance.

The equation to calculate Core Bound is:

Back End Bound Due To Latency Caused In The Uncore

TITLE: Back End Bound Due To Latency Caused In The Uncore

ISSUE_NAME: Backend^MemBound^UncoreBound

DESCRIPTION:

Cycles the back end is memory bound on the Uncore (i.e. anything outside the processor core: LLC, Memory, Ring, etc.)

RELEVANCE:

This metric represents how often the pipeline was back end bound on the Uncore (i.e. anything outside the processor core: LLC, Memory, Ring, etc.).  Avoiding cache misses (i.e. L2 misses) will improve the latency and increase performance.

EXAMPLE:

Back End Bound Due To Latency Caused By L2 Cache

TITLE: Back End Bound Due To Latency Caused By L2 Cache

ISSUE_NAME: Backend^MemBound^L2Bound

DESCRIPTION:

Cycles the back end was bound on the L2 cache

RELEVANCE:

This metric represents how often the pipeline was back end bound on the L2 cache.  Avoiding cache misses (i.e. L1 misses/L2 hits) will improve the latency and increase performance.

EXAMPLE:

For instance, if you have many L1 misses that hit in the L2 cache, you would see a high back end memory bound percentage in L2.

SOLUTION:

RELATED_SOURCES:

Back End Bound Due To Latency Caused By L1 Data Cache

TITLE: Back End Bound Due To Latency Caused By L1 Data Cache

ISSUE_NAME: Backend^MemBound^L1Bound

DESCRIPTION:

This metric describes the cycles the back end was bound on the L1 data cache.  The L1 cache typically has the shortest latency.  However, in certain cases like loads blocked on older stores, a load might suffer a high latency even though it is being satisfied by the L1. There are no fill-buffers allocated for L1 hits so instead we use the load matrix (LDM) stalls sub-event as it accounts for any non-completed load.

Bad Speculation Speculative Uops

TITLE: Bad Speculation Speculative Uops

ISSUE_NAME: BadSpeculation^SpeculativeUops

DESCRIPTION:

The percentage of pipeline slots wasted due to mis-speculated uops

RELEVANCE:

When there is significant bad speculation, this metric can help to determine the impact incurred from mis-speculated uops.

EXAMPLE:

For instance, if you had branch mispredicts or nukes (e.g. memory order nukes from unsuccessful memory disambiguation), you would see a high speculative uops percentage.

SOLUTION:

RELATED_SOURCES:

NOTES:

Bad Speculation Recovery Stalls

TITLE: Bad Speculation Recovery Stalls

ISSUE_NAME: BadSpeculation^RecoveryStalls

DESCRIPTION:

The percentage of pipeline slots wasted due to recovery from mis-speculated state (jeclears, nukes, etc.)

RELEVANCE:

When there is significant bad speculation, this metric can help to determine the impact incurred from recovering the speculative machine state.

EXAMPLE:

SOLUTION:

RELATED_SOURCES:

NOTES:

EQUATION:  (INT_MISC.RECOVERY_CYCLES * 4) / CPU_CLK_UNHALTED.THREAD
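As a minimal sketch, the equation above can be evaluated from two raw counter readings. The function and parameter names are hypothetical, the sample values are made up, and the code simply reproduces the expression as written rather than any particular tool's implementation.

```python
def recovery_stalls(recovery_cycles, unhalted_thread_cycles):
    """Evaluate (INT_MISC.RECOVERY_CYCLES * 4) / CPU_CLK_UNHALTED.THREAD.

    recovery_cycles: reading of INT_MISC.RECOVERY_CYCLES
    unhalted_thread_cycles: reading of CPU_CLK_UNHALTED.THREAD
    """
    if unhalted_thread_cycles <= 0:
        raise ValueError("unhalted_thread_cycles must be positive")
    return (recovery_cycles * 4) / unhalted_thread_cycles


# Made-up counter values, for illustration only.
ratio = recovery_stalls(recovery_cycles=1_000, unhalted_thread_cycles=100_000)
print(f"{ratio:.2%}")  # prints 4.00%
```

Both counters should come from the same collection interval and the same thread; otherwise the ratio is not meaningful.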
