/QaxXXX1,XXX2,XXX3... under windows?

Linux ICC allows the specification of multiple additional processor-optimized code paths:

-axSSE2,SSE3,SSSE3,SSE4,SSE4.1,SSE4.2

The Windows ICC only allows one.

/QaxSSE2

Is there a reason for this?

- Oliver

Oliver 'kfs1' Smith,
Lead Server Programmer,
Cornered Rat Software / Battleground Europe

I'm not entirely sure what you're getting at. The "a" in "ax" already stands for SSE2 (by default, if you don't specify what it should be). It wouldn't make sense to ask for up to 5 separate code paths, nor (in effect) to specify 2 architectures twice in the list, for the case where the compiler is able to generate different code for each case. The penalty for code size and for selecting among so many versions would outweigh any advantage of being able to optimize for each architecture.
SSE4 may be accepted on Linux for gcc compatibility; I believe it would be synonymous with SSE4.1, but I don't think it's an acceptable choice for Windows.
SSE3 is most often useful for complex arithmetic. However, I believe a CPU which isn't recognized as Intel will get the SSE2 path, unless you designate /arch:SSE3 as the fall-back "a" choice.
SSSE3, in my experience, is useful only for operations on overlapping array sections, only on Core 2 family CPUs.
SSE4.2 is seldom any different from SSE4.1.
So it is unusual that you can't make a more useful choice of a small number of architectures.

Hi Oliver,

Can you go into a little more detail on what the error is? Is the compiler complaining at compile time that it only allows one, or are multiple paths not being generated in the resulting code? The 11.1.060 compiler I'm trying takes the switch syntax without complaint:

>icl -QaxSSE3,SSE4.2 -Qrestrict -c test.cpp
Intel C++ Intel 64 Compiler Professional for applications running on Intel
 64, Version 11.1    Build 20100203 Package ID: w_cproc_p_11.1.060
Copyright (C) 1985-2010 Intel Corporation.  All rights reserved.

test.cpp

>icl -QaxSSE2,SSE3,SSE4.2 -Qrestrict -c test.cpp
Intel C++ Intel 64 Compiler Professional for applications running on Intel
 64, Version 11.1    Build 20100203 Package ID: w_cproc_p_11.1.060
Copyright (C) 1985-2010 Intel Corporation.  All rights reserved.

test.cpp

Brandon Hewitt
Technical Consulting Engineer

For 1:1 technical support: http://premier.intel.com

Software Product Support info: http://www.intel.com/software/support

Brad;

I was trying to set the option from inside Visual Studio, which only appears to take one /Qax setting.

Note: I realize /arch:SSE2 and /QaxSSE2 are redundant in this case :) It does the same thing even when I haven't accidentally selected the /arch:SSE2 build type ;)

Oliver 'kfs1' Smith,
Lead Server Programmer,
Cornered Rat Software / Battleground Europe

It does look like the VS properties setting permits you to set only 2 processor-specific paths: the default level in "Intel Processor-Specific Optimization" (which is SSE2 by default) and an additional level in "Add Processor-Optimized Code Path." If you want more (should the compiler find use for that many), you can set the properties lines you show to None and put your list in the Command Line Additional Options, e.g. /QaxSSE3,SSE4.1 (which gives you 3 paths: SSE2, SSE3, and SSE4.1).

Tim:

I'm not adjusting "Intel Processor-Specific Optimization", I'm adjusting "Add Processor-Optimized Code Path".

"/Qax" has no default:

1>icl: command line warning #10155: ignoring option '/Qax'; argument required

There are around 850,000 lines of source in our main code base, 1.5 million lines if you include 3rd-party components (not counting standard libraries, boost, STL, etc. ;), covering a disparate variety of domains, all of which can benefit from different SSE combinations.

Two years ago, just turning on SSE1 instructions, without introducing any aligned structures or anything, gave us a huge performance increase.

In our current dev branch, SSE2 didn't really do much overall, but there are areas of the code where it brings down execution times drastically. SSE3 etc., I'm not so sure; SSE4.2 really helps the XML-heavy UI.

Unfortunately, most of the existing analysis tools, like VTune, Parallel Studio, Valgrind etc., are fairly hard to use for profiling something this complex because they invariably resolve or modify the intricate timing minutiae, so all I've been able to measure is that each code path has provided a different but detectable improvement in overall performance.

Oliver 'kfs1' Smith,
Lead Server Programmer,
Cornered Rat Software / Battleground Europe

From the project's property "Add Processor-Optimized Code Path", you can only select one of the list items.

But there are two places to add what you'd like:
1. Tools -> Option -> Intel C++ page: there's a default setting field where you can add /QaxSSE2,SSE3,SSSE3.....
Once added here, it will affect all the projects/solutions built with the Intel C++ Compiler.

2. Select all the projects that have the Intel C++ icon (use Ctrl+mouse click), then go to the Project Property dialog and, under the C/C++ -> Command Line -> Additional Options field, add "/QaxSSE2,SSE3,SSSE3.....".
This might be the best choice, I think.

Jennifer

Thanks, that is what I've wound up doing. It might be worth adding that to the option's tooltip; the wording of the tooltip makes it very clear you can only select one, but it doesn't indicate that this is a Visual Studio limitation and not a compiler limitation.

Oliver 'kfs1' Smith,
Lead Server Programmer,
Cornered Rat Software / Battleground Europe
