User Guide

Set Up Environment to Analyze GPU Kernels

To analyze the performance of GPU kernels in your application, you need to enable the relevant permissions.
The GPU Roofline model analyzes and visualizes GPU kernel performance using benchmarks and hardware metric profiling.
It is recommended that you run the Survey step of the GPU Roofline with root privileges on Linux OS or as an Administrator on Windows* OS.
If you do not have root permissions on Linux OS, enable collecting GPU hardware metrics for non-privileged users as follows:
  1. Add your username to the video group.
    1. To check if you are already in the video group, run:
      groups | grep video
    2. If you are not part of the video group, add your username to it:
      sudo usermod -a -G video <username>
    3. To verify that you successfully added your username to the video group, run:
      groups
      If video is not listed, log out and log back in.
  2. Set the value of the dev.i915.perf_stream_paranoid sysctl option to 0:
    sudo sysctl -w dev.i915.perf_stream_paranoid=0
    This command makes a temporary change that is lost on the next reboot. To change this option permanently, run:
    sudo sh -c "echo dev.i915.perf_stream_paranoid=0 > /etc/sysctl.d/60-mdapi.conf"
  3. Disable the hangcheck time limit so that OpenCL™ kernels can run for a longer period of time. Do one of the following:
    • Run the command:
      sudo sh -c "echo N > /sys/module/i915/parameters/enable_hangcheck"
    • Append i915.enable_hangcheck=0 to GRUB_CMDLINE_LINUX_DEFAULT in the /etc/default/grub file. Run the following command to update the configuration:
      sudo update-grub
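To confirm that the steps above took effect, you can run a quick check such as the following. This is a minimal sketch and not part of the product: the helper names in_group, read_setting, and check_gpu_profiling_env are hypothetical, and the script assumes that the sysctl option is readable at /proc/sys/dev/i915/perf_stream_paranoid (the usual mapping of sysctl names to /proc/sys paths).

```shell
#!/bin/sh
# Hypothetical helpers for checking the GPU profiling setup described above.

# in_group GROUP "LIST": succeeds if GROUP appears as a whole word
# in the space-separated LIST of group names.
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# read_setting FILE: prints the file content, or "unavailable" if the
# file cannot be read (for example, when the i915 driver is not loaded).
read_setting() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unavailable"
    fi
}

# Reports the state of each setting next to the value the steps aim for.
check_gpu_profiling_env() {
    if in_group video "$(id -Gn)"; then
        echo "video group: ok"
    else
        echo "video group: missing (see step 1)"
    fi
    # Assumed path: sysctl dev.i915.perf_stream_paranoid under /proc/sys.
    echo "perf_stream_paranoid: $(read_setting /proc/sys/dev/i915/perf_stream_paranoid) (target: 0)"
    echo "enable_hangcheck: $(read_setting /sys/module/i915/parameters/enable_hangcheck) (target: N)"
}

check_gpu_profiling_env
```

The script only reads the current state and changes nothing, so it can be run without root privileges. A value of "unavailable" suggests that the corresponding file does not exist or is not readable on the system.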
Next, set up a project if you do not have one, and then run the GPU Roofline Insights perspective to analyze GPU kernels in your application.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.