Examine Data Transfers for Modeled Regions
Accuracy Level: Medium
Enabled Analyses: Survey with in-depth static analysis + Trip Counts and FLOP with callstacks, basic data transfer simulation, and GPU memory subsystem simulation (Characterization) + Modeling
Result Interpretation
After running the Offload Modeling perspective with Medium accuracy, you get an extended Offload Modeling report, which shows the following in addition to the basic data:
- More accurate estimations of traffic and time for all cache and memory levels.
- Measured data transfer and estimated data transfer between host and device memory.
- Total data for the loop/function from different callees.
The Offload Modeling perspective assumes a loop is parallel if its dependency type is unknown. This means that there is no information about the loop from the compiler or the loop is not explicitly marked as parallel, for example, with a programming model (OpenMP*, Data Parallel C++, Intel® oneAPI Threading Building Blocks).
- If you already have a report generated at a lower accuracy level, all offload recommendations, metrics, and speed-up estimates are updated to be more precise, taking the new data into account.
- The Medium accuracy level for the Offload Modeling perspective provides sufficient information about the memory and cache usage and the taxes of your offloaded application.
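If you script your collections, a Medium-accuracy run can be launched from the command line instead of the GUI. The following is a minimal sketch, assuming the `advisor` command-line tool is on your `PATH` and that your version accepts `--collect=offload` together with `--accuracy` and `--dry-run`. The helper name `offload_modeling_command`, the project directory `./advi_results`, and the application `./myApplication` are placeholders for illustration only.

```python
import shlex


def offload_modeling_command(app_args, project_dir="./advi_results",
                             accuracy="medium", dry_run=False):
    """Build an `advisor` command line for the Offload Modeling perspective.

    Hypothetical helper: it only assembles the argument list and does not
    run anything. Check the option names against your Advisor version.
    """
    if accuracy not in ("low", "medium", "high"):
        raise ValueError(f"unexpected accuracy level: {accuracy!r}")
    cmd = [
        "advisor",
        "--collect=offload",
        f"--accuracy={accuracy}",
        f"--project-dir={project_dir}",
    ]
    if dry_run:
        # Assumption: with --dry-run the tool only lists the analysis
        # commands for this accuracy level instead of running them.
        cmd.append("--dry-run")
    # Everything after "--" is the target application and its arguments.
    return cmd + ["--"] + list(app_args)


cmd = offload_modeling_command(["./myApplication"], dry_run=True)
print(shlex.join(cmd))
```

To actually run the collection, you could pass the returned list to `subprocess.run(cmd, check=True)` after dropping `dry_run=True`.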

In the Accelerated Regions tab of the Offload Modeling report, review the metrics about memory usage and data transfers:
- In the metrics table:
  - In the Taxes column of the Estimated Bound-by group, review a full picture of time taxes paid for offloading to a target platform.
  - In the Estimated Data Transfer column, review the amount of data read by and written to a target platform if code is offloaded.
  - In the Memory Estimates column group, see how well your application uses resources of all memory levels. Expand the group to see more detailed and accurate metrics for different memory levels.
- Select a code region from the table and review the details about the amount of data transferred between host and device memory in the Data Transfer Estimations pane.
  - See the total amount of data transferred in each direction and the corresponding offload taxes.
  - See hints about optimizing data transfers in the selected code region.
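To build intuition for how the amount of transferred data relates to the offload tax, a back-of-the-envelope calculation can help. The sketch below is not the model that Offload Modeling uses. It assumes a single fixed link bandwidth, no overlap between transfers and computation, and no per-transfer latency. All function names and input values are hypothetical.

```python
def transfer_time_s(bytes_moved, bandwidth_gb_s):
    """Time to move `bytes_moved` over a link of `bandwidth_gb_s` GB/s."""
    return bytes_moved / (bandwidth_gb_s * 1e9)


def transfer_tax(read_bytes, write_bytes, compute_time_s, bandwidth_gb_s):
    """Return (transfer time in seconds, share of total offload time).

    Simplifying assumption: transfers in both directions and the kernel
    run back to back, so the times simply add up.
    """
    transfer = (transfer_time_s(read_bytes, bandwidth_gb_s)
                + transfer_time_s(write_bytes, bandwidth_gb_s))
    total = transfer + compute_time_s
    return transfer, transfer / total


# Made-up inputs: 512 MB read by the device, 128 MB written back,
# 60 ms of estimated kernel time, 16 GB/s host-device bandwidth.
tax_s, share = transfer_tax(512e6, 128e6, 0.06, 16.0)
print(f"transfer time: {tax_s * 1e3:.1f} ms, share of total: {share:.0%}")
```

With these made-up numbers, data transfer accounts for a large part of the total time, which is the kind of region where the data transfer hints in the report are most worth reviewing.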
For details about metrics reported, see Accelerator Metrics.
Next Steps
- Based on the collected data, rewrite your code to offload to a target platform and measure the performance of GPU kernels with the GPU Roofline Insights perspective.
- Consider running the Offload Modeling perspective with a higher level of accuracy to get more precise offload recommendations.