January 25, 2009 11:00 PM PST
Use the options built into the Intel® C++ Compiler for Linux to improve performance on the Itanium® processor. The Intel compilers have robust feature sets to support optimization at all levels.
Use the appropriate compiler switches to optimize code, control compilation, and provide reporting. The compiler has four primary types of switches: optimization-level switches, interprocedural and floating-point optimization switches, aliasing switches, and report switches.
- Optimization Level Switches: Decreasing the number of instructions that are dynamically executed and replacing instructions with faster equivalents are perhaps the two most obvious ways to improve performance. Many traditional compiler optimizations fall into this category: copy and constant propagation, common subexpression elimination, dead-code elimination, peephole optimizations, function inlining, tail-call elimination, and so on. The ecc compiler provides a rich variety of such optimizations. Three different levels of optimization can be requested; the higher the optimization level, the longer the compile time.
| Switch | Description | Default |
|---|---|---|
| -O | Same as -O2 on Itanium®-based systems. | OFF |
| -O0 | Disables optimizations. | OFF |
| -O1 | Enables optimizations. Optimizes for speed. For the Itanium compiler, -O1 turns off software pipelining to reduce code size. | ON |
| -O2 | Same as -O on Itanium-based systems. | OFF |
| -O3 | Enables -O2 plus more aggressive optimizations that may increase the compilation time. Impact on performance is application-dependent; some applications may not see a performance improvement. | OFF |
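To make the optimization categories named above concrete, the following sketch applies three of them by hand: constant propagation, common subexpression elimination, and dead-code elimination. The function names are hypothetical, and the second function only illustrates the kind of rewrite an optimizer may perform. It is not actual output from ecc.

```c
#include <assert.h>

/* Hypothetical example: code as a programmer might first write it. */
int area_sum_naive(int w, int h)
{
    int scale = 4;                  /* constant: candidate for propagation */
    int unused = w * 100;           /* dead code: the value is never needed */
    int a = (w + h) * scale;        /* (w + h) is computed here ... */
    int b = (w + h) * (scale / 2);  /* ... and computed again here */
    (void)unused;                   /* only silences the unused-variable warning */
    return a + b;
}

/* The same computation after the three transformations, applied by hand. */
int area_sum_optimized(int w, int h)
{
    int t = w + h;                  /* common subexpression computed once */
    assert(area_sum_naive(w, h) == t * 4 + t * 2);  /* sanity check of the rewrite */
    return t * 4 + t * 2;           /* constants propagated, dead code removed */
}
```

Both functions return the same value for every input that does not overflow an int; an optimizing compiler is free to turn the first form into something like the second without any source change.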
Note that the -O3 switch will prefetch arrays only for unit-stride accesses. For non-unit-stride accesses, significant performance gains are possible through manual prefetching using the lfetch intrinsic.
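A minimal sketch of manual prefetching in a non-unit-stride loop follows. It deliberately uses the GCC/Clang builtin `__builtin_prefetch` as a portable stand-in, because the exact signature of the Intel lfetch intrinsic is not given in this article and should be taken from the compiler documentation. The function name and the `PREFETCH_DISTANCE` tuning value are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning parameter: how many iterations ahead to prefetch.
 * A good value depends on memory latency and the work done per iteration. */
#define PREFETCH_DISTANCE 8

/* Sums every stride-th element of a[0..n-1]. */
double sum_strided(const double *a, size_t n, size_t stride)
{
    double sum = 0.0;
    assert(stride > 0);
    for (size_t i = 0; i < n; i += stride) {
        size_t ahead = i + PREFETCH_DISTANCE * stride;
        if (ahead < n) {
#if defined(__GNUC__)
            /* Stand-in for the lfetch intrinsic: request a[ahead] for
             * reading (0) with low temporal locality (1). */
            __builtin_prefetch(&a[ahead], 0, 1);
#endif
        }
        sum += a[i];
    }
    return sum;
}
```

The prefetch is only a hint and does not change the result; whether it helps should be confirmed by measurement on the target system.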
- Interprocedural and Floating-Point Optimization Switches: Interprocedural optimizations allow the compiler to analyze and transform code across function boundaries, even if the functions are in separate source files. This gives the compiler a more complete view of the program, allowing it to produce better code.
| Switch | Description | Default |
|---|---|---|
| -ip | Enables interprocedural optimizations for single-file compilation. | OFF |
| -Obn | Controls the compiler's inline expansion. The amount of inline expansion performed varies with the value of n. | OFF |
| -ipo | Enables interprocedural optimizations across files. | OFF |
| -IPF_fma | Enables the combining of floating-point multiplies and add/subtract operations. | OFF |
| -IPF_fltacc | Enables optimizations that affect floating-point accuracy. | OFF |
| -IPF_fp_speculation mode | Enables floating-point speculation according to the specified mode. | OFF |
| -ivdep_parallel | Indicates there is no loop-carried memory dependency in the loop where the IVDEP directive is specified. | OFF |
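The IVDEP directive mentioned for -ivdep_parallel can be illustrated with a loop whose independence the compiler cannot prove on its own. In this sketch the function name and the calling contract (offset >= n) are assumptions made for the example. The pragma is guarded so that compilers other than Intel's simply skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example. The caller must guarantee offset >= n and that a[]
 * holds at least offset + n elements, so the elements read
 * (a[offset .. offset+n-1]) never overlap the elements written (a[0 .. n-1]).
 * The compiler cannot see this guarantee, so without a hint it has to
 * assume a loop-carried dependency. */
void shift_add(float *a, size_t n, size_t offset)
{
    assert(offset >= n);
#if defined(__INTEL_COMPILER)
#pragma ivdep   /* assert: no loop-carried dependency in the next loop */
#endif
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i + offset] + 1.0f;
    }
}
```

The directive is a promise made by the programmer. If the ranges did overlap, the optimized loop could produce different results from the straightforward one.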
- Aliasing Switches: Two variable expressions are aliases if they refer to the same memory location. In C, aliasing arises primarily from the use of pointers. Alias detection is difficult in general. Without accurate alias information, the compiler must be conservative and cannot apply optimizations such as software pipelining. The options below can therefore greatly aid the compiler and permit optimizing transformations that would otherwise be impossible.
If the -restrict switch is used, a pointer may be declared as follows: int *restrict pi; This declaration asserts that *pi cannot have an alias within its declared scope.
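As a sketch of how the restrict qualifier is typically used, the function below promises that its three arrays never overlap. The function name and parameter layout are hypothetical. The code uses the standard C99 `restrict` keyword, which the -restrict switch is described as enabling for this compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: out, x and y are declared restrict, so the compiler
 * may assume that a store to out[i] never changes any x[j] or y[j]. That
 * lets it reorder or overlap iterations, for example when software
 * pipelining the loop. */
void scale_add(size_t n, double *restrict out,
               const double *restrict x, const double *restrict y, double k)
{
    assert(out != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = k * x[i] + y[i];
    }
}
```

Calling the function with overlapping arrays, for instance passing the same buffer as both out and x, breaks the promise and gives undefined behavior, so the qualifier belongs only where the caller can guarantee disjoint storage.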
| Switch | Description | Default |
|---|---|---|
| -ansi_alias | Directs the compiler to assume that the program adheres to the ISO C standard aliasability rules. | OFF |
| -fno-alias | Assume no aliasing in program. | OFF |
| -fno-fnalias | Assume no aliasing within functions, but assume aliasing across calls. | OFF |
| -restrict | Enables pointer disambiguation with the restrict qualifier. | OFF |
- Report Switches: ecc can produce detailed optimization reports containing information on average instructions per cycle (IPC), instruction counts, loop transformations (software pipelining, distribution, etc.), strength reductions, predication, if-conversions, function inlining, and register allocation. The level of detail and scope can be specified via arguments. These reports can be used to identify code that may need to be hand-optimized (e.g., using pragmas, alias switches, and intrinsics).
| Switch | Description | Default |
|---|---|---|
| -opt_report | Generates an optimization report directed to stderr, unless -opt_report_file is specified. | OFF |
| -opt_report_filefilename | Specifies the filename for the optimization report. It is not necessary to invoke -opt_report when this option is specified. | OFF |
| -opt_report_levellevel | Specifies the verbosity level of the output. | OFF |
| -opt_report_phasename | Specifies the compilation phase for which reports are generated. The option can be used multiple times in the same compilation to get output from multiple phases. Use -opt_report_help to list the valid name arguments. | OFF |
| -opt_report_routinesubstring | Specifies a routine substring. Reports from all routines with names that include substring as part of the name are generated. By default, reports for all routines are generated. | OFF |
| -opt_report_help | Displays all possible settings for -opt_report_phase. No compilation is performed. | OFF |
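The sketch below shows one way the report switches might be combined for a single routine. The file name report_demo.c, the routine name, and the output file report.txt are hypothetical. The command line in the comment is assembled only from switches listed above plus the standard -c (compile only) flag, and the exact spelling of argument-taking switches may differ between compiler versions.

```c
/*
 * Hypothetical file: report_demo.c
 *
 * Possible invocation (an assumption, not a verified command line):
 *
 *   ecc -c -O3 -opt_report -opt_report_routinesaxpy report_demo.c 2> report.txt
 *
 * -opt_report sends the report to stderr, hence the 2> redirection.
 * -opt_report_routine is followed directly by the substring "saxpy", so
 * only routines whose names contain "saxpy" are reported.
 */
#include <assert.h>
#include <stddef.h>

/* A simple loop of the kind for which a report would list loop
 * transformations such as software pipelining. */
void saxpy_demo(size_t n, float k, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        y[i] = k * x[i] + y[i];
    }
}
```

Restricting the report to one routine keeps the output short enough to read while checking whether a particular loop was transformed as expected.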
Directives and Pragmas and Switches Oh My!

