TITLE: Back End Bound Due To Latency Caused By L1 Data Cache
ISSUE_NAME: Backend^MemBound^L1Bound
DESCRIPTION:
This metric estimates the fraction of cycles in which the back end was stalled on the L1 data cache. The L1 cache typically has the shortest latency. However, in certain cases, such as loads blocked on older stores, a load can suffer high latency even though it is satisfied by the L1. No fill buffers are allocated for L1 hits, so the metric instead uses the load matrix (LDM) stalls sub-event, which accounts for any load that has not completed.
The LDM_PENDING sub-event is new in the Intel microarchitecture code-named Ivy Bridge. It not only identifies when these stalls matter, but also supplies an upper bound on the possible L1 stalls overall, should there be a block type that is not covered by legacy counters.
L1 Bound: (CYCLE_ACTIVITY.STALLS_LDM_PENDING - CYCLE_ACTIVITY.STALLS_L1D_PENDING) / CPU_CLK_UNHALTED.THREAD
RELEVANCE:
EXAMPLE:
SOLUTION:
RELATED_SOURCES:
NOTES:
EQUATION: (CYCLE_ACTIVITY.STALLS_LDM_PENDING - CYCLE_ACTIVITY.STALLS_L1D_PENDING) / CPU_CLK_UNHALTED.THREAD
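As a worked illustration of the equation, the following minimal Python sketch computes the L1 Bound ratio from three raw counter totals. The function name `l1_bound`, its parameter names, and the sample counts are hypothetical and chosen only for illustration; how the counts are collected (for example, from a profiler export) is assumed to happen elsewhere.

```python
def l1_bound(stalls_ldm_pending: int,
             stalls_l1d_pending: int,
             clk_unhalted_thread: int) -> float:
    """Return the L1 Bound ratio for one measurement interval.

    Arguments are raw totals of the three events named in the equation:
      CYCLE_ACTIVITY.STALLS_LDM_PENDING
      CYCLE_ACTIVITY.STALLS_L1D_PENDING
      CPU_CLK_UNHALTED.THREAD
    """
    if clk_unhalted_thread <= 0:
        # The ratio is undefined without any unhalted cycles.
        raise ValueError("CPU_CLK_UNHALTED.THREAD must be positive")
    # Stall cycles with a pending load, minus those where the load
    # missed the L1D, leaves the stalls attributed to the L1 itself.
    return (stalls_ldm_pending - stalls_l1d_pending) / clk_unhalted_thread


# Hypothetical counter totals, not measured data.
ratio = l1_bound(stalls_ldm_pending=3_000_000,
                 stalls_l1d_pending=1_000_000,
                 clk_unhalted_thread=10_000_000)
print(f"L1 Bound: {ratio:.1%}")  # prints "L1 Bound: 20.0%"
```

The sketch applies the equation as written and does not clamp the result, so sampling or multiplexing noise in the inputs could in principle yield a slightly negative value.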