11.0.075 Win32 produces SSE2 instructions with -QxSSE

Hi,

I'm compiling my application with -QxSSE -GL, since I have users that have non-SSE2 capable machines. I just got a minidump from such a user, and the compiler has issued a 'movsd xmm, mem' instruction. The subroutine deals only with floats, but does have some SSE intrinsics.

As far as I can tell, the code which causes the problems is:

mem[2] = _mm_setr_ps(_mem[8], _mem[9], 0, 0);
den[2] = _mm_setr_ps(_den[8], _den[9], 0, 0);

mem[] and den[] are __m128, while _mem and _den are float *.
The compiler cleverly restructures each line into a single movsd (for _mem[8], _mem[9]) followed by xorps (for the 0, 0) and movlhps (to merge the two). The problem is that movsd is an SSE2 instruction, which -QxSSE should have disabled. As far as I can see, this is the only SSE2 instruction used.
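
For illustration, the same {a, b, 0, 0} vector can also be written with an intrinsic that maps to an SSE1 instruction. The sketch below uses _mm_loadl_pi (movlps) from xmmintrin.h; the helper name load2_zero_hi is hypothetical, and this has not been verified against ICL 11.0, which may still apply its own instruction selection under -GL.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics only; emmintrin.h (SSE2) is not included */

/* Hypothetical helper: returns {p[0], p[1], 0, 0} using SSE1 intrinsics. */
static __m128 load2_zero_hi(const float *p)
{
    /* _mm_loadl_pi corresponds to movlps (SSE1): it loads 64 bits into the
       low half of the result and keeps the upper half of the first operand,
       which is all zeros here. No alignment is required for p. */
    return _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)p);
}
```

With this helper, the first line from the post would read mem[2] = load2_zero_hi(&_mem[8]); whether the generated code then avoids movsd would still need to be checked in the disassembly.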

If I remove '-GL', the problem goes away, but so does some of the performance, and the users on non-SSE2-capable processors are the ones who need the optimizations the most.

Is there a workaround I can apply to tell the compiler that SSE is ok, but SSE2 really isn't, no matter how fancy it is?

Apologies if this is fixed in 11.1.048; I keep getting linker errors about symbol files with that release, so I've had to stay on 11.0 for now.

Since ICL 11.0, the only option which doesn't generate SSE2 is /arch:ia32. 10.0 had a -QxK option for SSE, but it wasn't reliable in library support.

Tim is right. Please use /arch:ia32.

Apologies if this is fixed in 11.1.048; I keep getting linker errors about symbol files with that release, so I've had to stay on 11.0 for now.

Do you mean the .sbr file issue below? If so, it's being fixed as we speak.

BSCMAKE: error BK1506 : cannot open file 'C:Dev_build_intDSPTestRelDebugDspFilter.sbr': No such file or directory

Jennifer

No, in 11.1.048 I'm seeing

mumble_pch.obj : fatal error LNK1318: Unexpected PDB error; RPC (23) '(0x000006BA)'

The same code compiles without any problems using 11.0.075, apart from the unwanted SSE2 code, that is.

I can't use 10.1 for this, as that gives me missing vtable symbols in declspec(dllimport)ed C++ classes.

Right now it looks like I'll have to split my performance-critical code out into a DLL without any external C++ classes, compile that with 10.1 -QxK, and compile the rest with 11.0 and -arch:ia32. That is more than a little messy, though, and I'd really like to avoid it if possible. Compiling all the code with -arch:ia32 isn't an option, as I need the vectorized speedup in the performance-critical parts to run in real time on the non-SSE2 processors.

Quoting - thorvald.natvig
No, in 11.1.048 I'm seeing

mumble_pch.obj : fatal error LNK1318: Unexpected PDB error; RPC (23) '(0x000006BA)'

This issue was reported before and has since been fixed. I verified the original test case, and it is indeed fixed.

So this may be caused by a different scenario. Could you send me more info or a test case?

Thanks,
Jennifer

I haven't been able to create a minimal test case for this; it happens when I link my application, but not on smaller tests.
I'll test some more and see if I can narrow it down a bit, and if so I'll post a follow-up here.
