GPU Roofline Insights

Use the GPU Roofline chart to answer the following questions:
  • What is the maximum achievable performance with your current hardware resources?
  • Does your application work optimally on current hardware resources?
  • If not, what are the best candidates for optimization?
  • Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
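The questions above map onto the standard roofline model, in which attainable performance is the smaller of the compute roof and the memory roof (arithmetic intensity times bandwidth). The following is a minimal sketch of that arithmetic, not of how Intel Advisor computes its chart. The peak values, kernel names, and measurements are hypothetical numbers chosen for illustration.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def limiting_factor(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Classify a kernel by comparing its intensity with the ridge point."""
    ridge_point = peak_gflops / peak_bandwidth_gbs  # FLOP/byte where the roofs meet
    if arithmetic_intensity < ridge_point:
        return "memory bandwidth"
    return "compute capacity"


# Hypothetical device peaks (assumptions, not measured values).
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# Hypothetical kernels: (arithmetic intensity in FLOP/byte, measured GFLOPS).
kernels = {"stencil": (0.5, 40.0), "matmul": (20.0, 600.0)}

for name, (intensity, measured) in kernels.items():
    roof = attainable_gflops(intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    bound = limiting_factor(intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    print(f"{name}: roof {roof:.0f} GFLOPS, at {measured / roof:.0%} of roof, limited by {bound}")
```

A kernel far below its roof is a candidate for optimization, and the limiting factor indicates whether to work on memory traffic or on compute efficiency first.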
To run the GPU Roofline Insights perspective:
  1. Prerequisite: Set up your environment to analyze GPU kernels.
  2. Choose a collection accuracy level, which selects the perspective steps and sets the analysis properties, depending on the desired results:
    • Low: Model your application performance for a target device and get the basic information about potential speed-up and performance.
    • Medium: Model your application performance and data transfers between host and target devices.
    • High: Model your application performance and data transfers, and detect parallel regions to extend the list of offload candidates.
    • Custom: Customize the perspective flow and properties.
    A default accuracy level is preselected. For more information, see GPU Roofline Accuracy Presets.
    • For GPU Roofline, the accuracy level controls the complexity of the CPU Roofline chart generated for loops/functions in your code executed on the CPU. If you are interested only in code regions executed on the GPU, the lowest accuracy level is sufficient.
    • The higher the accuracy level you choose, the more runtime overhead is added to your application.
  3. Run the perspective: click the run button.
    While the perspective is running, you can do the following in the Analysis Workflow:
    • Control the perspective execution with the corresponding buttons:
      • Stop data collection and see the already collected data.
      • Pause data collection.
      • Cancel data collection and discard the collected data.
    • Expand an analysis to control its execution with the corresponding buttons:
      • Pause the analysis and see the already collected data.
      • Stop the analysis and start the next selected analysis.
      • Interrupt execution of all selected analyses and see the already collected data.
    You can generate command lines for the selected perspective configuration by clicking the Command Line button. For a CLI workflow example, see Command Line Use Cases.
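For orientation, the following minimal sketch assembles a GPU Roofline collection command of the kind the Command Line button produces. It assumes the `advisor` CLI with the `--collect=roofline`, `--profile-gpu`, and `--project-dir` options. The project directory and application path are hypothetical, and the command generated by the GUI for your configuration may contain additional options. The script only builds and prints the command; it does not run it.

```python
import shlex


def gpu_roofline_command(project_dir, app, app_args=()):
    """Build an argument list for a GPU Roofline collection.

    project_dir and app are placeholders; replace them with your own paths.
    """
    return [
        "advisor",
        "--collect=roofline",            # Survey plus Characterization in one step
        "--profile-gpu",                 # analyze kernels executed on the GPU
        f"--project-dir={project_dir}",  # where results are stored
        "--",                            # separates advisor options from the target
        app,
        *app_args,
    ]


# Hypothetical project directory and application name.
cmd = gpu_roofline_command("./advi_results", "./myApplication")
print(shlex.join(cmd))
```

To launch the collection from a script, the same list can be passed to `subprocess.run(cmd, check=True)` on a system where Intel Advisor is installed and its environment is set up.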
Intel® Advisor generates a GPU Roofline report. Continue to examine GPU bottlenecks on the Roofline chart to investigate the results.
