User Guide

Set Up Environment to Analyze GPU Kernels

To analyze the performance of GPU kernels in your application with GPU Roofline, you need to enable the relevant permissions.
The GPU Roofline model analyzes and visualizes GPU kernel performance using benchmarks and hardware metric profiling.
On Windows* OS, run the Survey step of the GPU Roofline as an Administrator.
On Linux* OS, run the Survey step of the GPU Roofline with root privileges.
If you do not have root permissions on Linux OS, enable collecting GPU hardware metrics for non-privileged users as follows:
  1. Add your username to the video group.
    1. To check if you are already in the video group, run:
      groups | grep video
    2. If you are not part of the video group, add your username to it:
      sudo usermod -a -G video <username>
    3. Type groups to verify that you successfully added your username to the video group. If video is not listed, log out and log back in.
  2. For Ubuntu* 19.10 and higher: Add your username to the render group.
    1. To check if you are already in the render group, run:
      groups | grep render
    2. If you are not part of the render group, add your username to it:
      sudo usermod -a -G render <username>
    3. Type groups to verify that you successfully added your username to the render group. If render is not listed, log out and log back in.
  3. Set the value of the dev.i915.perf_stream_paranoid sysctl option to 0. Both commands in this step require root privileges:
    sysctl -w dev.i915.perf_stream_paranoid=0
    This command makes a temporary change that is lost on the next reboot. To change this option permanently, run:
    echo dev.i915.perf_stream_paranoid=0 > /etc/sysctl.d/60-mdapi.conf
  4. Disable the time limit so that OpenCL™ kernels can run for a longer period of time. Do one of the following:
    • To disable the time limit temporarily until the next reboot, run the command:
      sudo sh -c "echo N> /sys/module/i915/parameters/enable_hangcheck"
    • To disable the time limit permanently, append i915.enable_hangcheck=0 to GRUB_CMDLINE_LINUX_DEFAULT in the /etc/default/grub file. Then run the following command to update the GRUB configuration:
      sudo update-grub
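Steps 1 to 3 above can be collected into one script. The following is a minimal sketch, not part of the product: it assumes sudo is available, and the names TARGET_USER, APPLY, run, and apply_gpu_profiling_setup are hypothetical. It only prints the commands unless APPLY=1 is set. The permanent sysctl change is wrapped in sudo sh -c because the redirection into /etc/sysctl.d must itself run as root.

```shell
#!/bin/sh
# Sketch: apply steps 1-3 for one user. Dry run by default; set APPLY=1 to execute.
# TARGET_USER and APPLY are hypothetical variable names chosen for this example.

TARGET_USER=${TARGET_USER:-$(id -un)}

# run CMD...: print the command, and execute it only when APPLY=1.
# The printed form drops shell quoting, so treat it as a log, not as copy-paste input.
run() {
    printf '+ %s\n' "$*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

apply_gpu_profiling_setup() {
    # Steps 1 and 2: supplementary groups.
    # The render group applies to Ubuntu 19.10 and higher; remove that line elsewhere.
    run sudo usermod -a -G video "$TARGET_USER"
    run sudo usermod -a -G render "$TARGET_USER"

    # Step 3, temporary: takes effect immediately and is lost on reboot.
    run sudo sysctl -w dev.i915.perf_stream_paranoid=0

    # Step 3, permanent: sh -c makes the redirection run with root privileges.
    run sudo sh -c 'echo dev.i915.perf_stream_paranoid=0 > /etc/sysctl.d/60-mdapi.conf'
}

apply_gpu_profiling_setup
```

Group changes still require logging out and back in before they apply to your session. Step 4 is left out of the sketch on purpose, because the permanent variant involves editing the GRUB configuration by hand.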
Continue to set up a project if you do not have one, and run the GPU Roofline Insights perspective to analyze GPU kernels in your application.
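Before running the perspective as a non-privileged user, it can help to confirm that the settings described on this page are in place. The following read-only sketch is an illustration under stated assumptions: the function names are hypothetical, and it assumes the sysctl option is exposed at /proc/sys/dev/i915/perf_stream_paranoid, which is how sysctl names map to files on Linux. It compares group names exactly, so a group such as videox is not mistaken for video.

```shell
#!/bin/sh
# Read-only sketch: reports the state of each setting and changes nothing.

# in_group GROUP [GROUP_LIST]: succeed if GROUP is an exact entry in
# GROUP_LIST, which defaults to the current session's groups (id -nG).
in_group() {
    wanted=$1
    group_list=${2-$(id -nG)}
    for g in $group_list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# show_setting LABEL FILE EXPECTED: print the value stored in FILE next to
# the value this page asks for, or note that the file cannot be read.
show_setting() {
    if [ -r "$2" ]; then
        printf '%s = %s (expected %s)\n' "$1" "$(cat "$2")" "$3"
    else
        printf '%s: %s not readable (driver not loaded, or root needed)\n' "$1" "$2"
    fi
}

check_gpu_profiling_setup() {
    for grp in video render; do
        if in_group "$grp"; then
            printf 'group %s: member\n' "$grp"
        else
            printf 'group %s: not a member\n' "$grp"
        fi
    done
    show_setting perf_stream_paranoid /proc/sys/dev/i915/perf_stream_paranoid 0
    show_setting enable_hangcheck /sys/module/i915/parameters/enable_hangcheck N
}

check_gpu_profiling_setup
```

Because id -nG reports the groups of the current session, a missing group right after running usermod usually means you still need to log out and log back in.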

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.