10.1.017 vectorization confusion...

So I'm pretty happy with the vectorization results I'm getting... on Windows. On OS X however, with the same code base, and what I believe are equivalent compiler options, it's a whole nother beast.

OK, granted, I'm compiling on a different OS, for a different subset of CPUs, with a different version of the compiler, but I thought the whole point of writing vectorizable code was so that I could avoid handcoding assembly for all these different situations. Am I wrong (delusional?) to expect similar results from both compilers on the same code base?

It's doing a great job of fusing some consecutive inlined loops (in places) and vectorizing those, but it's completely ignoring the vast majority of loops that vectorize under Windows.

For example... even the following tinker toy does not vectorize with OS X (10.1.017)...

void test (float* in, float* out, int len)
{
    // almost certainly unnecessary, but just for kicks...
    __assume_aligned(in, 16);
    __assume_aligned(out, 16);

    int i, local_length = len;

    #pragma ivdep
    for (i = 0; i < local_length; i++)
    {
        out[i] = in[i] + (float)0.5f; // how explicit can I be? you're not complaining about the type anyway.
    }
}

produces: unsupported loop structure...

huh? really? are you serious?

Please tell me there's something wrong with my command line...

Windows cl:

/c /O3 /Og /Ob2 /Oi /Ot /Oy /Qipo /GA /I "../../../../QuickTime/Libraries" /I "../../../../vstsdk" /I "../../../../juce/trunk" /I "../../../../juce/trunk/src" /I "../../../../juce/trunk/extras/audio plugins/wrapper" /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "MYDEFINE" /D "_MBCS" /D "_VC80_UPGRADE=0x0700" /GF /FD /EHsc /MT /Zp16 /GS- /Fo".\Release/" /W3 /nologo /Qftz /Qfp-speculationfast /Qparallel /Qrestrict /Oa /QaxWOTS /Qvec-report3

OS X cl:

CompileICC /Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386/MyFile.o /Users/me/Code/myThang/build/myProject/../../src/MyDir/MyFile.cpp
cd /Users/me/Code/myThang/build/myProject
/usr/bin/icc-10.1-base/bin/icc -x c++ -arch i386 -dev-usr-root=/Developer/usr -O3 -w1 -fomit-frame-pointer -prefetch- -ip -inline-level=2 -parallel -DMYDEFINE -D__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__=1040 -gcc -restrict -vec_report3 -alias-args- -no-alias-const -fno-alias -fvisibility=hidden -ftz -fp-speculationfast -check-uninit -fpascal-strings -I/Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/My-Thang-txt.hmap -F/Users/me/Code/myThang/build/myProject/build/Release -F/Library/Frameworks -iquote/Library/Frameworks/Intel_IPP.framework/Versions/5.3.4.075/ia32/include -iquote.. -iquote../../build/myProject/build/Release/My-Thang.app.dSYM/Contents/Resources -iquote../../build/myProject/build/myProject.build/Debug/myProject.build/Objects-normal -iquote../../build/myProject/build/myProject.build/Release/myProject.build/Objects-normal -iquote../../build/myProject/build/Release/My-Thang.app.dSYM/Contents/Resources/DWARF -iquote../../build/myProject/build/myProject.build/Debug/myProject.build/Objects-normal/i386 -iquote../../build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386 -I/Users/me/Code/myThang/build/myProject/build/Release/include -I/Developer/SDKs/MacOSX10.5.sdk/Library/Frameworks -I../../../juce/trunk -ipo -fast -axSTP -include myProject_Prefix.pch -c /Users/me/Code/myThang/build/myProject/../../src/MyDir/MyFile.cpp -o /Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386/MyFile.o

...note that the redundancy in this cl was only picked up along the road to futility.

Anyway, a little humor for the day... (OS X ICC SAYS:)

ipo-2: warning #11043: unresolved _GSS_C_NT_HOSTBASED_SERVICE
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_delete_sec_context
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_import_name
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_init_sec_context
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_release_buffer
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_release_name
Referenced in /usr/lib/libcups.2.dylib

It seems, through some inevitable Xcode header / framework search path anomaly, "somebody" has confused Intel Performance Primitives with Internet Printing Protocol (Google has the same problem).

I can almost hear Moe saying "IPP? Has anyone seen IPP? How about IPO-PO?"


Among your Windows options, you have set /Oa ("treat argument pointers as not aliased") and various options requesting that the compiler generate both vector and non-vector code paths. In your icc options you include -restrict, but that has no effect unless the arguments actually carry restrict qualifiers. Because you specified the 32-bit compiler, note that SSE2 vectorization was not yet on by default in the version you quote (use e.g. -xW). You also have a bunch of contradictory alias options; why not simply use -ansi-alias (assert that the source code doesn't violate the C and C++ standards on aliasing)? With restrict qualifiers, the signature would read:

void test (float* restrict in, float* restrict out, int len)
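To make the restrict suggestion concrete, here is a minimal sketch of the same loop with the qualifiers in place. It is written as plain C99, where restrict is a standard keyword; the function name add_half is hypothetical and not from the original post. Whether a given compiler version then vectorizes the loop still depends on the target and options, so treat this as an illustration of the aliasing promise rather than a guaranteed fix.

```c
/* Sketch only: add_half is a hypothetical name for the poster's test().
 * The restrict qualifiers promise the compiler that, inside this function,
 * the memory reached through `in` is not also reached through `out`, so
 * loads and stores in the loop can be reordered and grouped freely. */
void add_half(const float *restrict in, float *restrict out, int len)
{
    int i;

    for (i = 0; i < len; i++) {
        out[i] = in[i] + 0.5f;  /* single-precision constant, no conversion */
    }
}
```

Calling it with two distinct buffers, e.g. add_half(src, dst, n), keeps the promise; passing overlapping buffers would break it and the behavior is undefined. In a C++ translation unit restrict is not a standard keyword, which is why icc needs -restrict to accept it there; several other compilers spell the extension __restrict instead.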
