/arch, /Qx, and some /Qax

Dear Forum,

There are a few issues around the compiler options /arch and /Qx. The issues are relevant to both compiler versions 11.0 and 10.1, and below I use the terminology for 11.0:

The Intel compiler and optimization help files state that /Qx may generate Intel-specialized optimizations for the processor specified, and that /arch generates optimizations for the architecture specified, which covers both Intel and non-Intel processors. Also, /Qx and /arch are used to generate the baseline code when using /Qax. Now:

1. What is the behaviour when both /arch and /Qx are specified (e.g., /QxSSE3 /arch:SSE4.1)?

2. When /Qax is used in addition to (1), what baseline would be created (e.g., /QxSSE3 /arch:SSE4.1 /QaxSSE4.2)?

3. In the compiler doc for 11.0, under /Qx, it is stated that the default for this option is /arch:. It looks like a mistake, I'd expect the default to be /Qx. What is the correct default value?

4. Does /Qx generate code that might NOT RUN on non-Intel processors? Or will the generated code simply run ineffectively on non-Intel processors?

Thanks,
Gil Moses.


The default of /arch:SSE2 is intentional. This allows running on any SSE2 processor, regardless of manufacturer.

For the 64-bit compiler, /QaxSSE4.1 will generate both an SSE2 path (for any CPU) and an SSE4.1 path, should the compiler decide there is sufficient difference to justify both. I can't make out from the docs whether, in the 32-bit compiler, the "a" stands for IA32 (as it did in 10.1) or SSE2 (as it does for 64-bit).

Perhaps you would like to submit a documentation clarification issue on premier.intel.com, explaining what you would like to learn from the docs.

I understand that the older switches from previous compiler versions, which aren't mentioned in the docs, should be interpreted as -arch:IA32.

I stay away from multiple, possibly conflicting switch combinations. I don't understand their popularity. The doc appears to state specifically that the "a" in /Qax.. will be ignored when another switch is given. The switch /QxK was partly implemented, though undocumented, in 10.1, but it seems buggy; it should be taken as -arch:IA32 in 11.x.

I've already opened an issue at premier, and the correspondence is still ongoing. The initial response I got was unclear.

In the doc it says that the default for /Qx is /arch<...>. AFAIK the default for some option A can't be a value for option B.

Gil.

Someone hoped to clear up the confusion between options which require recognition of specific Intel CPU types and those which simply target CPU capabilities, by introducing the alternative /arch options for the latter purpose. So, the default has to be /arch:SSE2 to provide satisfactory support for the widest range of targets, with no /Qx option for Intel only. This leaves the /Qax options, such as /QaxSSE4.1, to be explained as supporting both generic SSE2 CPUs (at least in the 64-bit case) and optimization for Intel SSE4 CPUs.

Supposing that you wanted to support generic SSE3, along with (Intel specific) SSE4, you could try /QxSSE4.1 /arch:SSE3

For the 32-bit compiler, the doc should explain whether the equivalent to the ICL 10.1 option /QaxS is /QxSSE4.1 /arch:IA32 or something else, and whether the 11.0 /QaxSSE4.1 is equivalent to 10.1 /QxWS or /QaxS. Avoiding this question seems to imply the combinations haven't been tested thoroughly. Clearly, there is a need for a generic option which has well-tested, reasonably comprehensive optimization (SSE2), and which avoids the confusion of different numerical properties according to CPU type, or of combinations whose properties aren't adequately tested.

In hindsight, the option /QxK, which was supported well in 8.1 32-bit, hasn't been fully reliable since, and it has no 11.0 equivalent, although /QxK or /QxB provided the best performance for a range of applications using primarily float data. Unfortunately, the good performance for mixed float and double comes at the expense of less predictable numerical properties. With /QxB, in particular, one never knew when float expressions would be promoted to double. The Intel CPU types which depended on this option for full performance have been out of production for several years now.

I could wait for someone on the compiler team to clarify, or... hey, why not jump in?

AFAIK, the "/arch" option appeared for compatibility with the Microsoft compiler, and /arch:sse2 is equivalent to /QxW. Like Tim said, if you want to be safe, I'd recommend against mixing /Qx and /arch options.

I do use combinations of /Qx and /Qax, though (I think the 'a' stands for auto-dispatch). Prior to 11.0, I would commonly use "/QxW /QaxS" to set a baseline of SSE2 plus compiler-dispatched paths for any extensions up to SSE4.1, which is a generally "safe" combination these days. If I wanted something that only needed to run on Core 2 or higher (potentially reducing code size), I could use "/QxT /QaxS" to set the baseline to SSSE3 and go up to SSE4.1. Note that it only makes sense to have one /Qx switch and one /Qax switch, and this is how the dialog is set up in Visual Studio.
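
To make the baseline-plus-dispatch idea concrete, here is a rough Python model of how I picture the path selection. It is only a sketch of the behaviour described in this thread, not the compiler's actual dispatch logic: the function select_path, the LEVELS list, and the rule that the dispatched path is taken only on Intel CPUs are my own illustrative assumptions.

```python
# Instruction-set levels, ordered from least to most capable.
# The names follow the 11.0 option spelling used in this thread.
LEVELS = ["IA32", "SSE2", "SSE3", "SSSE3", "SSE4.1", "SSE4.2"]


def rank(level):
    """Position of a level in LEVELS; higher means more capable."""
    return LEVELS.index(level)


def select_path(cpu_is_intel, cpu_level, baseline, dispatch=None):
    """Pick the code path a hypothetical dispatcher would run.

    baseline -- level every target CPU must support (the /Qx or /arch role)
    dispatch -- optional Intel-only level (the /Qax role)
    """
    # Assumption: a CPU below the baseline cannot run the program at all.
    if rank(cpu_level) < rank(baseline):
        raise RuntimeError("CPU lacks the baseline level " + baseline)
    # Assumption: the dispatched path is used only on a capable Intel CPU.
    if dispatch is not None and cpu_is_intel and rank(cpu_level) >= rank(dispatch):
        return dispatch
    # Everything else falls back to the baseline path.
    return baseline
```

With an SSE2 baseline and an SSE4.1 dispatch level (the "/QxW /QaxS" case), select_path(True, "SSE4.1", "SSE2", "SSE4.1") picks the SSE4.1 path, while a non-Intel CPU or an Intel CPU that only reaches SSE3 gets the SSE2 baseline. Raising the baseline to SSSE3 (the "/QxT /QaxS" case) makes the model reject an SSE2-only CPU.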
