User Guide

Why Not Offloaded: Less or Equally Profitable Than Children/Parent Offloads

Symptoms

A code region has "Less or equally profitable than children offloads" or "Less or equally profitable than parent offload" as the reason why it is not offloaded.

Cause and Solution

In the commands below, replace <APM> with $APM on Linux* OS or %APM% on Windows* OS.
Message: Less or equally profitable than children offloads
Details and Cause: Offloading child loops/functions of this code region is more profitable than offloading the whole region with all its children. This means that the Estimated Time on Target Device (+Host) for the region of interest is greater than or equal to the sum of the Estimated Time on Target Device (+Host) values of its child regions that are profitable for offloading.
See the following metrics to identify the specific reason that prevents offloading:
  • Total execution time metrics reported in the Offload Information column group
  • Taxes in the Overhead column group
  • Information about trip counts in the Trip Counts column group
  • Dependencies in the Dependency Type column of the Loop/Function column group
Solution 1. Disable analyzing child loops of all region heads using the --no-model-children option with analyze.py. With this option, the Offload Advisor only considers the region heads for potential offloading.
Solution 2. You can tell Offload Advisor to model offloading for only specific code regions even if they are not profitable. Rerun the performance modeling with --select-loops to specify the loops of interest and --enforce-offloads to make sure all of them are offloaded. For example:
advixe-python <APM>/analyze.py <project-dir> --select-loops=[<file-name1>:<line-number1>,<file-name1>:<line-number2>,<file-name2>:<line-number3>] --enforce-offloads
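When many loops are of interest, the --select-loops value can be assembled from a list of source locations instead of being typed by hand. The following is a minimal sketch in Python, not part of Offload Advisor: the helper names build_select_loops and build_analyze_command, the file names, the line numbers, and the project directory are hypothetical, and the script only builds and prints the command without running it.

```python
import os


def build_select_loops(loops):
    """Format (file name, line number) pairs as a --select-loops value."""
    entries = ",".join(f"{name}:{line}" for name, line in loops)
    return f"--select-loops=[{entries}]"


def build_analyze_command(apm_dir, project_dir, loops):
    """Return the analyze.py command as an argument list (nothing is executed)."""
    return [
        "advixe-python",
        os.path.join(apm_dir, "analyze.py"),
        project_dir,
        build_select_loops(loops),
        "--enforce-offloads",
    ]


# Hypothetical source locations and project directory, for illustration only.
loops = [("foo.cpp", 12), ("foo.cpp", 34), ("bar.cpp", 56)]
apm_dir = os.environ.get("APM", "<APM>")  # falls back to the placeholder
command = build_analyze_command(apm_dir, "./advi_project", loops)
print(" ".join(command))
```

If the printed command is pasted into a shell, the bracketed value may need quoting, depending on the shell.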
Message: Less or equally profitable than parent offload
Details and Cause: Offloading the whole parent code region of the region of interest is more profitable than offloading any of its child regions separately. This means that the Estimated Time on Target Device (+Host) for the region of interest is greater than or equal to the Estimated Time on Target Device (+Host) of its parent region. Offloading a child code region might be limited by high offload taxes.
Solution: You can tell Offload Advisor to model offloading for only specific code regions even if they are not profitable. Rerun the performance modeling with --select-loops to specify the loops of interest and --enforce-offloads to make sure all of them are offloaded. For example:
advixe-python <APM>/analyze.py <project-dir> --select-loops=[<file-name1>:<line-number1>,<file-name1>:<line-number2>,<file-name2>:<line-number3>] --enforce-offloads
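The two comparisons behind these messages can be illustrated with a small Python sketch. It is a literal reading of the conditions stated above, not the Offload Advisor implementation: the Region class, the function names, the region names, and the time values (assumed to be in seconds) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    name: str
    est_time: float          # Estimated Time on Target Device (+Host); seconds assumed
    profitable: bool = True  # whether offloading this region on its own is profitable
    children: List["Region"] = field(default_factory=list)


def less_profitable_than_children(region):
    """Region time >= sum of the times of its children that are profitable to offload."""
    kids = [c for c in region.children if c.profitable]
    if not kids:
        return False  # no profitable children to compare against
    return region.est_time >= sum(c.est_time for c in kids)


def less_profitable_than_parent(region, parent):
    """Region time >= time of its parent region, as the message is described."""
    return region.est_time >= parent.est_time


# Made-up numbers: the outer loop is estimated at 10 s, its children at 3 s and 4 s.
inner_a = Region("inner_a", 3.0)
inner_b = Region("inner_b", 4.0)
outer = Region("outer", 10.0, children=[inner_a, inner_b])

print(less_profitable_than_children(outer))         # 10 >= 3 + 4
print(less_profitable_than_parent(inner_a, outer))  # 3 >= 10

# A child whose estimate (for example, due to offload taxes) exceeds its parent's.
taxed_child = Region("taxed_child", 12.0)
print(less_profitable_than_parent(taxed_child, outer))  # 12 >= 10
```

In this toy data, the outer region would be reported as less or equally profitable than its children, and only the third region would be reported as less or equally profitable than its parent.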
