Profiling an FPGA-driven DPC++ Application
- All the Cookbook recipes are scalable and can be applied to Intel VTune Amplifier 2018 and higher. Slight version-specific configuration changes are possible.
- Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler starting with its version for Intel® oneAPI Base Toolkit (Beta). You can still use a standalone version of the VTune Profiler, or its versions integrated into Intel Parallel Studio XE or Intel System Studio.
- Operating system: Linux* OS (Ubuntu* 18.04)
- CPU: Intel server platform code-named Cascade Lake
- FPGA: Intel® Programmable Acceleration Card (Intel® PAC) with Intel® Arria® 10 GX FPGA or Intel® Stratix 10 GX FPGA PAC board for DPC++ (with installable add-on)
Install and Configure the Toolkit
- Plug the Intel PAC card into the PCIe slot on the machine.
- Unzip the FPGA add-on package and run setup.sh. Select all default options.
- Set up the oneAPI environment.
  source <oneAPI-install-dir>/setvars.sh
- Install the FPGA board.
  aocl install
- Run the diagnose command to ensure that all diagnostics pass.
  aocl diagnose
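The board setup steps above can be chained so that a failure stops the sequence early. The following is a minimal sketch, not part of the original recipe: the function names require_tool and setup_fpga_board are hypothetical, and it assumes setvars.sh has already been sourced in the current shell. Whether the exit status of aocl diagnose reflects failed diagnostics is also an assumption, so its printed output should still be read.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named command is on PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing tool: $1 (was setvars.sh sourced?)" >&2
        return 1
    fi
}

# Hypothetical wrapper for the board setup steps, stopping at the first failure.
# Assumes the oneAPI environment (setvars.sh) is already active in this shell.
setup_fpga_board() {
    require_tool aocl || return 1
    aocl install || return 1
    # The exit status of the last command becomes the function's status.
    aocl diagnose
}
```

After sourcing setvars.sh, calling setup_fpga_board runs both aocl steps in order. The check for aocl on PATH catches the common case where the environment script was not sourced in the same shell.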
Build the Sample Application
- Open the crr sample folder.
  cd BaseKit-code-samples/FPGAExampleDesigns/crr
- Open the src/CMakeLists.txt file.
- Locate the line of code that lists hardware flags. It should start with set(HARDWARE_LINK_FLAGS.
- Add -Xsprofile to the set of flags.
- Go back to the main directory for the sample. Create a new folder called build and open it.
  mkdir build
  cd build
- Compile the sample.
  cmake ..
  make fpga
  This process can take several hours. Once it has finished, you should have an executable file called crr.fpga.
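The manual edit of src/CMakeLists.txt can also be scripted, which helps when the sample is rebuilt on several machines. This is a hedged sketch only: the helper name add_profile_flag is hypothetical, and it assumes the hardware flags are written as a double-quoted string on the same line as set(HARDWARE_LINK_FLAGS. If the file is laid out differently, the helper reports an error and leaves the file unchanged.

```shell
#!/bin/sh
# Hypothetical helper: add -Xsprofile to the HARDWARE_LINK_FLAGS line of a
# CMakeLists.txt file. Assumes the flags are a double-quoted string that
# starts on the same line as set(HARDWARE_LINK_FLAGS.
add_profile_flag() {
    file=$1
    # Do nothing if the flag already appears in the file, so reruns are safe.
    if grep -q -- '-Xsprofile' "$file"; then
        return 0
    fi
    # Fail loudly if the expected line is missing instead of editing blindly.
    if ! grep -q 'set(HARDWARE_LINK_FLAGS[[:space:]]*"' "$file"; then
        echo "HARDWARE_LINK_FLAGS line not found in $file" >&2
        return 1
    fi
    # Insert the flag right after the opening quote of the flag string.
    sed 's/\(set(HARDWARE_LINK_FLAGS[[:space:]]*"\)/\1-Xsprofile /' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example use from the crr sample directory (the build can take several hours):
#   add_profile_flag src/CMakeLists.txt && mkdir -p build && cd build && cmake .. && make fpga
```

The helper writes to a temporary file and then renames it, so an interrupted run does not leave a half-edited CMakeLists.txt behind.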
Run CPU/FPGA Interaction Analysis
- Launch VTune Profiler and click New Project from the Welcome page. The Create a Project dialog box opens.
- Specify a project name, a location for your project, and click Create Project. The Configure Analysis window opens.
- In the WHERE pane, select Local Host.
- In the WHAT pane, select Launch Application as the target.
- In the Application field, specify the path to the crr.fpga executable.
- In the Application parameters field, enter ordered_inputs.csv.
- In the HOW pane, select CPU/FPGA Interaction (preview) from the Platform Analysis group.
- In the analysis settings, select AOCL Profiler for the FPGA profiling data source.
- Click Start at the bottom to run the analysis.
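The same collection can likely be started from the command line, which is convenient on a headless server. The sketch below is an approximation of the GUI steps above rather than a documented equivalent: the wrapper name profile_crr is hypothetical, and the analysis type name fpga-interaction is an assumption that should be checked against the output of vtune -help collect for the installed version. It does not select the AOCL Profiler data source; that setting would still need to be made in the GUI or through the matching knob listed in the tool's help.

```shell
#!/bin/sh
# Hypothetical wrapper around the VTune Profiler command line.
# Assumption: the analysis type is named fpga-interaction in this VTune
# version; list the available analysis types with: vtune -help collect
profile_crr() {
    app=$1          # path to the crr.fpga executable
    result_dir=$2   # directory where VTune stores the collected result
    vtune -collect fpga-interaction -result-dir "$result_dir" -- "$app" ordered_inputs.csv
}
```

For example, profile_crr ./crr.fpga r001fpga would launch crr.fpga with ordered_inputs.csv as its argument and store the result in a directory named r001fpga, which can then be opened in the VTune Profiler GUI.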
- FPGA top compute tasks
- Top tasks and hotspots for the CPU
- Data transfer size
- Average bandwidth for transferred data
- Start/end times
- Overtime stalls
- Bandwidth metrics