Developer Guide

Area Analysis of System

The <project_dir>/reports/report.html file contains information about the area use of your DPC++ system.
The report provides the following information:
  • Detailed area breakdown of the whole DPC++ system, mapped to your source code where possible.
  • Architectural details that give insight into the generated hardware, along with actionable suggestions to resolve potential inefficiencies.
In the Reports pane's Area Analysis drop-down menu, select Area Analysis of System.
As you can observe in the following figure, the report is divided into three levels of hierarchy:
  • System area: Used by all kernels, pipes, interconnects, and board logic.
  • Kernel area: Used by a specific kernel, including overheads such as dispatch logic.
  • Block area: Used by a specific block within a kernel. A block represents a branch-free section of your source code (for example, a loop body). To view the area use information from the source code lines associated with a block, expand the report entry for that block.
Area Analysis of System Report Hierarchy
The area use data are estimates that the Intel® oneAPI DPC++/C++ Compiler generates. These estimates might differ from the final area utilization results.

Messages in the Area Analysis of System Report

After you compile your DPC++ application, review the Area Analysis of System report that the Intel® oneAPI DPC++/C++ Compiler generates. In addition to summarizing the application's resource use, the report suggests how to modify your design to improve efficiency. The following sections describe the messages that appear in the Area Analysis of System report.
Message for Board Interface
The Area Analysis of System report identifies the amount of logic that the Intel® oneAPI DPC++/C++ Compiler generates for the Custom Platform, or board interface. The board interface is the static region of the device that facilitates communication with external interfaces such as PCIe®. The Custom Platform specifies the size of the board interface.
Message for Function Overhead
The Area Analysis of System report identifies the amount of logic that the Intel® oneAPI DPC++/C++ Compiler generates for tasks such as dispatching kernels.
Message for State
The Area Analysis of System report identifies the amount of resources that your design uses for live values and control logic. To reduce the reported area consumption under State, modify your design as follows:
  • Decrease the size of local variables.
  • Decrease the scope of local variables by localizing them whenever possible.
  • Decrease the number of nested loops in the kernel.
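The first two suggestions above can be sketched as follows. This is an illustrative sketch only, not code from this guide: the function names, the `kN` constant, and the data types are assumptions, and the bodies are written as ordinary host C++ so that they compile without an FPGA toolchain. In a real design, the loop would sit inside a kernel. Whether the second version actually lowers the State figure depends on the compiler's own optimizations, so confirm any change in the report.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical problem size, chosen only for illustration.
constexpr std::size_t kN = 8;

// Before: the temporary `sq` is 64 bits wide and declared at function scope,
// so it is a wider value with a longer lifetime than the computation needs.
std::int64_t sum_of_squares_wide(const std::array<std::int16_t, kN>& in) {
  std::int64_t total = 0;
  std::int64_t sq = 0;
  for (std::size_t i = 0; i < kN; ++i) {
    sq = static_cast<std::int64_t>(in[i]) * in[i];
    total += sq;
  }
  return total;
}

// After: `sq` is declared inside the loop body (smaller scope) and sized to
// the data. The square of a 16-bit value always fits in 32 bits.
std::int64_t sum_of_squares_narrow(const std::array<std::int16_t, kN>& in) {
  std::int64_t total = 0;
  for (std::size_t i = 0; i < kN; ++i) {
    const std::int32_t sq = static_cast<std::int32_t>(in[i]) * in[i];
    total += sq;
  }
  return total;
}
```

Both functions return the same result. The difference is only in how wide the temporary is and how long it stays live.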
Message for Feedback
The Area Analysis of System report specifies the resources that your design uses for loop-carried dependencies.
To reduce the reported area consumption under Feedback, decrease the number and size of loop-carried variables in your design.
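As a rough sketch of this advice, the two functions below compute the same average but carry different state from one loop iteration to the next. The function names, the `kN` constant, and the input type are assumptions made for illustration, and the code is plain host C++ rather than a kernel. The actual effect on the Feedback figure depends on the compiler, so treat this as a pattern to try and then verify in the report.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, fixed trip count for the example.
constexpr std::size_t kN = 16;

// Before: two 64-bit values (`sum` and `count`) are carried across iterations.
std::uint32_t average_wide(const std::array<std::uint8_t, kN>& in) {
  std::uint64_t sum = 0;
  std::uint64_t count = 0;
  for (std::size_t i = 0; i < kN; ++i) {
    sum += in[i];
    ++count;
  }
  return static_cast<std::uint32_t>(sum / count);
}

// After: one 16-bit value is carried across iterations.
// - `count` is dropped because it always equals the known trip count kN.
// - `sum` is sized to its range: at most 16 * 255 = 4080, which fits in 16 bits.
std::uint32_t average_narrow(const std::array<std::uint8_t, kN>& in) {
  std::uint16_t sum = 0;
  for (std::size_t i = 0; i < kN; ++i) {
    sum = static_cast<std::uint16_t>(sum + in[i]);
  }
  return static_cast<std::uint32_t>(sum / kN);
}
```

The narrower accumulator is safe here only because the input range and trip count are known at compile time. With unbounded inputs, the wider type would be required for correctness.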
Messages for Private Variable Storage
The Area Analysis of System report provides information on the implementation of private memory based on your DPC++ design. For single work-item kernels, the Intel® oneAPI DPC++/C++ Compiler implements private memory differently depending on the type of variable. The compiler implements scalars and small arrays in registers of various configurations (for example, plain registers, shift registers, and barrel shifters), and implements larger arrays in block RAM.
The following table lists messages and notes of different private variable storage types:
Additional Information About Area Analysis of System Report Message
Message
Notes
Implementation of Private Memory Using On-Chip Block RAM
Message:
Private memory implemented in on-chip block RAM.
Notes:
The block RAM implementation creates a system that is similar to local memory for NDRange kernels.
Implementation of Private Memory Using On-Chip Block ROM
Message:
None.
Notes:
For each use of an on-chip block ROM, the Intel® oneAPI DPC++/C++ Compiler creates another instance of the same ROM. There is no explicit annotation for private variables that the Intel® oneAPI DPC++/C++ Compiler implements in on-chip block ROM.
Implementation of Private Memory Using Registers
Message:
Implemented using registers of the following size:
  • <X> registers of width <Y> bits and depth <Z>.
    • Depth was increased by a factor of <N> due to a loop initiation interval of <M>.
    • Each register is implemented in a RAM-based FIFO and consumes <U> RAMs.
  • ...
Notes:
Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in registers. The Intel® oneAPI DPC++/C++ Compiler might implement a private variable in many registers. This message provides a list of the registers with their specific widths and depths.
Implementation of Private Memory Using Shift Registers
Message:
Implemented as a shift register with <N> or fewer tap points. This is a very efficient storage type.
Implemented using registers of the following sizes:
  • <X> register(s) of width <Y> bits and depth <Z>.
    • Depth was increased by a factor of <N> due to a loop initiation interval of <M>.
    • Each register is implemented in a RAM-based FIFO and consumes <U> RAMs.
  • ...
Notes:
Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in shift registers. This message provides a list of shift registers with their specific widths and depths.
The Intel® oneAPI DPC++/C++ Compiler might break a single array into several smaller shift registers depending on its tap points.
The compiler might overestimate the number of tap points.
Implementation of Private Memory Using Barrel Shifters with Registers
Message:
Implemented as a barrel shifter with registers due to dynamic indexing. This is a high overhead storage type. If possible, change to compile-time known indexing. The area cost of accessing this variable is shown on the lines where the accesses occur.
Implemented using registers of the following size:
  • <X> registers of width <Y> bits and depth <Z>.
    • Depth was increased by a factor of <N> due to a loop initiation interval of <M>.
    • Each register is implemented in a RAM-based FIFO and consumes <U> RAMs.
  • ...
Notes:
Reports that the Intel® oneAPI DPC++/C++ Compiler implements a private variable in a barrel shifter with registers because of dynamic indexing.
This row in the report does not specify the full area use of the private variable. The report shows additional area use information on the lines where the variable is accessed.
  • The Area Analysis of System report annotates memory information on the line of code that declares or uses private memory, depending on its implementation.
  • When the Intel® oneAPI DPC++/C++ Compiler implements private memory in on-chip block RAM, the Area Analysis of System report displays the relevant local-memory-specific messages for the private memory system.
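The contrast between the shift register and barrel shifter messages can be illustrated with a small sketch. The code below is an assumption-laden example, not taken from this guide: the function names and the `kTaps` and `kLen` constants are hypothetical, and the code is plain host C++ so that it compiles without an FPGA toolchain. It shows two ways to keep a sliding window of recent inputs in a small private array. In the first, every array index is a compile-time constant once the fixed-count inner loops are unrolled, which is the access pattern the shift register message describes. In the second, the write index `head` is only known at run time, which is the kind of dynamic indexing the barrel shifter message warns about. Which storage type the compiler actually selects for a given kernel must be checked in the report.

```cpp
#include <array>
#include <cstddef>

// Hypothetical window size and input length, chosen only for illustration.
constexpr std::size_t kTaps = 4;
constexpr std::size_t kLen = 8;

// Shift-register style: the window is shifted by one each iteration and the
// new sample always enters at index 0. The inner loops have a fixed trip
// count, so in a kernel they would typically be fully unrolled, leaving only
// compile-time known indices.
std::array<int, kLen> moving_sum_shift(const std::array<int, kLen>& in) {
  std::array<int, kLen> out{};
  int window[kTaps] = {0, 0, 0, 0};
  for (std::size_t i = 0; i < kLen; ++i) {
    for (std::size_t t = kTaps - 1; t > 0; --t) {
      window[t] = window[t - 1];
    }
    window[0] = in[i];
    int sum = 0;
    for (std::size_t t = 0; t < kTaps; ++t) {
      sum += window[t];
    }
    out[i] = sum;
  }
  return out;
}

// Circular-buffer style: the write position `head` changes at run time, so
// the store into `window` uses a dynamic index.
std::array<int, kLen> moving_sum_dynamic(const std::array<int, kLen>& in) {
  std::array<int, kLen> out{};
  int window[kTaps] = {0, 0, 0, 0};
  std::size_t head = 0;
  for (std::size_t i = 0; i < kLen; ++i) {
    window[head] = in[i];  // dynamic index
    head = (head + 1) % kTaps;
    int sum = 0;
    for (std::size_t t = 0; t < kTaps; ++t) {
      sum += window[t];
    }
    out[i] = sum;
  }
  return out;
}
```

Both functions produce the same sums of the last four inputs. The rewrite from the second form to the first is one way to follow the report's advice to "change to compile-time known indexing".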
