Cycles and uOps Analysis

The Cycles and uOps analysis type uses event-based sampling collection and targets the Intel® microarchitecture code name Sandy Bridge.

Use this analysis type to identify performance issues in the core pipeline and understand the application execution flow.

The Cycles and uOps analysis type also helps identify the following performance problems:

  • instruction delivery shortage (instruction starvation) due to front-end issues

  • wasted work due to speculatively dispatched operations

  • stalls at retirement

To analyze the control flow of the application, explore the Instruction Retired event data and the precise branch event data provided in the Assembly pane.

The instruction events and the corresponding branch events used in this analysis type help evaluate function call counts and basic block execution counts.

The Cycles and uOps analysis type also uses a performance metric based on the UOPS_RETIRED.ANY/INST_RETIRED.ANY ratio. Explore the values of this ratio to find code where floating-point exception handlers are invoked frequently (ratio >> 1) and cause performance problems.
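As a rough illustration of how this ratio can be read, the following sketch computes uOps per retired instruction for a few functions and flags the ones well above 1. The function names, the event counts, and the threshold of 3.0 are all hypothetical; the text above only says "ratio >> 1" and does not define a cutoff.

```python
# Hypothetical per-function event counts (made-up numbers for illustration).
event_counts = {
    "compute_kernel": {"UOPS_RETIRED.ANY": 1_200_000, "INST_RETIRED.ANY": 1_000_000},
    "fp_denormal_loop": {"UOPS_RETIRED.ANY": 9_000_000, "INST_RETIRED.ANY": 1_500_000},
}

# Assumed cutoff for "ratio >> 1"; not a value taken from the tool.
SUSPICIOUS_RATIO = 3.0


def uops_per_instruction(counts):
    """Return UOPS_RETIRED.ANY / INST_RETIRED.ANY, or None if no instructions retired."""
    instructions = counts["INST_RETIRED.ANY"]
    if instructions == 0:
        return None
    return counts["UOPS_RETIRED.ANY"] / instructions


def flag_functions(counts_by_function, threshold=SUSPICIOUS_RATIO):
    """List (function, ratio) pairs whose ratio exceeds the threshold, highest first."""
    flagged = []
    for name, counts in counts_by_function.items():
        ratio = uops_per_instruction(counts)
        if ratio is not None and ratio > threshold:
            flagged.append((name, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


for name, ratio in flag_functions(event_counts):
    print(f"{name}: {ratio:.2f} uOps per instruction")
```

With these sample numbers only fp_denormal_loop is flagged; a function like this would be a candidate for a closer look in the Assembly pane.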

To see the full list of events used for this analysis type:

  1. Click the New Analysis toolbar button.

    The Analysis Type window opens.

  2. From the left pane, select Microarchitecture Analysis > CPU Specific Analysis > Sandy Bridge Analysis > Cycles and uOps.

    The Cycles and uOps configuration pane opens on the right. The Details section provides a table with the processor events used for this analysis type.

You can choose to view Cycles and uOps analysis results in any of the following viewpoints:

  • Hardware Event Counts: Displays the event count for all collected processor events. While the Hardware Event Sample Counts viewpoint provides the actual number of samples collected for an event, the Hardware Event Counts viewpoint estimates the number of times this event occurred during the collection.

  • Hardware Event Sample Counts: Displays the sample count for all collected processor events. While the Hardware Event Counts viewpoint estimates the number of times an event occurred during the collection, the Hardware Event Sample Counts viewpoint provides the actual number of samples collected for this event.

  • Hotspots: Helps identify hotspots, that is, code regions in the application that consume a lot of CPU time.

  • Task Time: Visualizes tasks (logical units of work on specific threads) based on ITT API annotations. Use it to identify the tasks with the highest execution time and to analyze the threads responsible for a particular task.
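The difference between a sample count and an estimated event count can be sketched as follows. The sketch assumes the usual event-based sampling relationship, where one sample is taken every "sample after" occurrences of an event, so the estimate is the sample count multiplied by that interval. The sample counts and sample-after values below are hypothetical, and the exact formula the tool applies is not stated in this document.

```python
def estimate_event_count(sample_count, sample_after_value):
    """Estimate event occurrences, assuming one sample per sample_after_value events."""
    return sample_count * sample_after_value


# Hypothetical numbers of samples collected per event.
samples = {"INST_RETIRED.ANY": 1500, "UOPS_RETIRED.ANY": 2100}

# Hypothetical sampling intervals (events per sample).
sample_after = {"INST_RETIRED.ANY": 2_000_003, "UOPS_RETIRED.ANY": 2_000_003}

estimates = {
    event: estimate_event_count(count, sample_after[event])
    for event, count in samples.items()
}

for event, estimate in estimates.items():
    print(f"{event}: {samples[event]} samples, about {estimate:,} events")
```

Under this assumption, the sample count is an exact tally of what was collected, while the event count is a statistical estimate whose accuracy improves as more samples are collected.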

Each viewpoint consists of the following windows/panes:

  • Summary window displays statistics on the overall application execution.

  • Bottom-up pane displays performance data per metric (event ratio/event count/sample count) for each hotspot function.

  • Top-down Tree window displays hotspot functions in the call tree, with performance metrics for a function only (Self value) and for a function and its children together (Total value).

  • Caller/Callee window displays parent and child functions of the selected focus function. This window is available only if stack collection was enabled during analysis configuration.

  • Timeline pane displays how CPU time usage and the event count for a particular event changed over the application run time.

  • Tasks, Tasks over Time, and Tasks by Threads windows provide details on tasks specified in your code with the Task API.
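The Self and Total values shown in the Top-down Tree window can be illustrated with a small call tree. In this sketch the Total value of a function is its own Self value plus the Total values of the functions it calls. The CallNode type, the function names, and the metric values are hypothetical and only serve to show the arithmetic.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """One function in a call tree with its Self metric value (hypothetical type)."""
    name: str
    self_value: int
    children: list = field(default_factory=list)


def total_value(node):
    """Total value: the node's Self value plus the Total value of every child."""
    return node.self_value + sum(total_value(child) for child in node.children)


def format_tree(node, depth=0):
    """Return indented lines showing Self and Total for each node in the tree."""
    lines = [f"{'  ' * depth}{node.name}: self={node.self_value} total={total_value(node)}"]
    for child in node.children:
        lines.extend(format_tree(child, depth + 1))
    return lines


# Made-up metric values (for example, event counts attributed to each function).
tree = CallNode("main", 10, [
    CallNode("parse", 40, [CallNode("tokenize", 25)]),
    CallNode("render", 60),
])

print("\n".join(format_tree(tree)))
```

A function with a small Self value but a large Total value, such as main here, spends most of its cost in its callees, so the optimization target is usually further down the tree.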
