Developer Guide and Reference

Contents

High-Level Optimization (HLO)

High-level Optimizations (HLO) exploit the properties of source code constructs (for example, loops and arrays) in applications developed in high-level programming languages. While the default optimization level, option
O2
, performs some high-level optimizations, specifying the
O3
option provides the best chance for performing loop transformations to optimize memory accesses.
Loop optimizations may result in calls to library routines that can result in additional performance gain on Intel® microprocessors than on non-Intel microprocessors.
The optimizations performed can also be affected by certain options, such as
/arch
(Windows*),
-m
(Linux*
and
macOS*
), or
[Q]x
options.
Additional HLO transformations may be performed for Intel® microprocessors than for non-Intel microprocessors.
Within HLO, loop transformation techniques include:
  • Loop Permutation or Interchange
  • Loop Distribution
  • Loop Fusion
  • Loop Unrolling
  • Data Prefetching
  • Scalar Replacement
  • Unroll and Jam
  • Loop Blocking or Tiling
  • Partial-Sum Optimization
  • Predicate Optimization
  • Loop Reversal
  • Profile-Guided Loop Unrolling
  • Loop Peeling
  • Data Transformation: Malloc Combining and Memset Combining, Memory Layout Change
  • Loop Rerolling
  • Memset and Memcpy Recognition
  • Statement Sinking for Creating Perfect Loopnests
  • Multiversioning: Checks include Dependency of Memory References, and Trip Counts
  • Loop Collapsing