IA-32 and Intel® 64 Processor Targeting Overview

The compiler documentation lists many options for optimizing for particular processors or processor families. Some of these are duplicates or older options that are maintained for reasons of compatibility with other or older compilers, which can be confusing. This article tries to summarize the relationships between different switches, and explain which are the most important and useful in practice.

There are two main categories. The first consists of microarchitecture-related switches that generate code that runs fast on some processors or processor families but does not run at all on others. These typically make use of additional instruction sets that are not available on all processors. This is by far the more important category.

The second category consists of tuning switches. These may also generate code that runs faster on some processors, but the code will run successfully on all processors. They typically involve more subtle scheduling decisions and do not invoke additional instruction sets. An example is multiplication by a power of 2: on some processors an integer multiply may be best, while on others a shift operation might be faster, but all processors support both types of instruction.

By default, the compiler typically tunes for a blend of recent Intel processors.
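The multiplication example can be made concrete with a small C sketch. The two functions below are hypothetical names chosen for illustration; they compute the same result, and a tuning decision amounts to the compiler choosing which of the two instruction forms to emit for the target processor. Unsigned arithmetic is used so that both forms are well defined and identical for every input.

```c
#include <stdint.h>

/* Multiply by 8 written as an integer multiply. */
uint32_t times8_mul(uint32_t x)
{
    return x * 8u;
}

/* The same operation written as a left shift by 3 (8 == 2^3). */
uint32_t times8_shift(uint32_t x)
{
    return x << 3;
}
```

Either form runs on every processor, which is why this kind of choice falls under tuning rather than instruction-set targeting. In practice an optimizing compiler is free to pick either form regardless of how the source is written.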

Category 1a: microarchitecture-specific

/arch:… (-m… or -arch) Optimizes for both Intel and compatible non-Intel processors that support the specified instruction set. On other processors, may result in an illegal instruction error at runtime.

/Qx… (-x…) Optimizes for Intel processors that support the specified instruction set. On other processors, including all non-Intel processors, gives a runtime error explaining that the executable was not built to run on this processor. Performs additional optimizations that are not performed by the corresponding /arch or -m switch.

-march… Optimizes for some limited combinations of processors and instruction sets. Not recommended.
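One way to see which instruction set a category 1a switch has enabled is to inspect the compiler's predefined macros. The sketch below assumes the macro names commonly predefined by GCC-compatible compilers (`__SSE2__`, `__SSE4_2__`, `__AVX__`, `__AVX2__`); which of them a given compiler and platform actually defines is an assumption to verify against its documentation, and the function name `compiled_isa` is hypothetical.

```c
/* Returns a label for the highest instruction set this translation unit
   was compiled for, judged only by predefined macros (assumed names). */
const char *compiled_isa(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline (no SSE2 macro defined)";
#endif
}
```

The result reflects what the code was built for, not what the machine running it supports. That gap is exactly why code built with these switches can fail at runtime on other processors.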

Category 1b: fat binaries (microarchitecture-specific code, but also an alternative default code path that should work for most or all processors).

/Qax… (-ax…) Generates one default code path, optimized for any Intel or compatible non-Intel processor that supports SSE2 instructions, and one or more additional code paths, for Intel processors only, that use the corresponding instruction set(s). The default code path can be modified using the switches in category 1a.
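The idea of a fat binary can be sketched by hand as a runtime dispatch between two code paths. This is only a conceptual illustration, not the mechanism the compiler generates: the function names are hypothetical, both paths are plain C, and the check uses the GCC/Clang builtin `__builtin_cpu_supports`, which tests for a feature only, whereas the compiler-generated check described above also restricts the specialized path to Intel processors.

```c
#include <stddef.h>

/* Default code path: stands in for the baseline that runs everywhere. */
static long sum_default(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += v[i];
    return s;
}

/* Hypothetical processor-specific path. In a real fat binary the compiler
   would emit this variant using the extended instruction set; here it is
   ordinary C, unrolled by 4 only to make the two paths visibly distinct. */
static long sum_specialized(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; ++i)
        s += v[i];
    return s;
}

/* Returns nonzero if the specialized path may be used on this processor.
   Assumes a GCC- or Clang-compatible compiler on x86; otherwise returns 0. */
static int cpu_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2");
#else
    return 0;
#endif
}

/* Single entry point: picks a code path at runtime. */
long sum_dispatch(const int *v, size_t n)
{
    return cpu_has_avx2() ? sum_specialized(v, n) : sum_default(v, n);
}
```

Callers see one function and get the same result on every processor; only the path taken differs. With the /Qax… (-ax…) switch this selection is generated automatically, so no such code needs to be written by hand.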

Category 2: tuning only, no extended instruction sets

/tune:… (-tune…) Fortran only. Tuning switch kept for makefile compatibility. Does not currently influence generated code.

/G… (-mtune…, -mcpu…) Tuning switch kept for makefile compatibility. Does not currently influence generated code.

See the main compiler documentation for the possible arguments taken by all of the above switches.

Recommendations:
The recommended processor-specific switch to optimize for a specific Intel processor is /Qx… (-x…). The recommended processor-specific switch to optimize for a specific Intel or compatible non-Intel processor is /arch:… (-m…).
To generate optimized code paths for one or more specific Intel processors, in addition to a default optimized code path for any Intel or compatible non-Intel processor, use the switch /Qax… (-ax…). The properties of the default code path can be modified by using the /Qx… (-x…) or /arch:… (-m…) switch in conjunction with the /Qax… (-ax…) switch.
The use of category 2 “tuning only” switches is not recommended with the current generation of compilers.

Further information on switches from categories 1a and 1b:
/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations
