drd
October 13, 2008 6:50 PM PDT
10.1.017 vectorization confusion...

So I'm pretty happy with the vectorization results I'm getting... on Windows.  On OS X however, with the same code base, and what I believe are equivalent compiler options, it's a whole nother beast.

OK, granted, I'm compiling on a different OS, for a different subset of CPUs, with a different version of the compiler, but I thought the whole point of writing vectorizable code was so that I could avoid handcoding assembly for all these different situations.  Am I wrong (delusional?) to expect similar results from both compilers on the same code base?

It's doing a great job of fusing some consecutive inlined loops (in places) and vectorizing those, but it's completely ignoring the vast majority of loops which vectorize under Windows.

For example... even the following tinker toy does not vectorize with OS X (10.1.017)...

void test (float* in, float* out, int len)
{
    // almost certainly unnecessary, but just for kicks...
    __assume_aligned(in, 16);
    __assume_aligned(out, 16);

    int i, local_length = len;

    #pragma ivdep
    for (i = 0; i < local_length; i++)
    {
        out[i] = in[i] + (float)0.5f;  // how explicit can I be?  you're not complaining about the type anyway.
    }
}

produces: unsupported loop structure...

huh? really? are you serious?

Please tell me there's something wrong with my command line...

Windows cl:

/c /O3 /Og /Ob2 /Oi /Ot /Oy /Qipo /GA /I "../../../../QuickTime/Libraries" /I "../../../../vstsdk" /I "../../../../juce/trunk" /I "../../../../juce/trunk/src" /I "../../../../juce/trunk/extras/audio plugins/wrapper" /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /D "MYDEFINE" /D "_MBCS" /D "_VC80_UPGRADE=0x0700" /GF /FD /EHsc /MT /Zp16 /GS- /Fo".\Release/" /W3 /nologo /Qftz /Qfp-speculationfast /Qparallel /Qrestrict /Oa /QaxWOTS /Qvec-report3

OS X cl:

CompileICC /Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386/MyFile.o /Users/me/Code/myThang/build/myProject/../../src/MyDir/MyFile.cpp
cd /Users/me/Code/myThang/build/myProject
/usr/bin/icc-10.1-base/bin/icc -x c++ -arch i386 -dev-usr-root=/Developer/usr -O3 -w1 -fomit-frame-pointer -prefetch- -ip -inline-level=2 -parallel -DMYDEFINE -D__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__=1040 -gcc -restrict -vec_report3 -alias-args- -no-alias-const -fno-alias -fvisibility=hidden -ftz -fp-speculationfast -check-uninit -fpascal-strings -I/Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/My-Thang-txt.hmap -F/Users/me/Code/myThang/build/myProject/build/Release -F/Library/Frameworks -iquote/Library/Frameworks/Intel_IPP.framework/Versions/5.3.4.075/ia32/include -iquote.. -iquote../../build/myProject/build/Release/My-Thang.app.dSYM/Contents/Resources -iquote../../build/myProject/build/myProject.build/Debug/myProject.build/Objects-normal -iquote../../build/myProject/build/myProject.build/Release/myProject.build/Objects-normal -iquote../../build/myProject/build/Release/My-Thang.app.dSYM/Contents/Resources/DWARF -iquote../../build/myProject/build/myProject.build/Debug/myProject.build/Objects-normal/i386 -iquote../../build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386 -I/Users/me/Code/myThang/build/myProject/build/Release/include -I/Developer/SDKs/MacOSX10.5.sdk/Library/Frameworks -I../../../juce/trunk -ipo -fast -axSTP -include myProject_Prefix.pch -c /Users/me/Code/myThang/build/myProject/../../src/MyDir/MyFile.cpp -o /Users/me/Code/myThang/build/myProject/build/myProject.build/Release/myProject.build/Objects-normal/i386/MyFile.o

...note that the redundancy in this cl was only picked up along the road to futility.




Anyway, a little humor for the day... (OS X ICC SAYS:)

ipo-2: warning #11043: unresolved _GSS_C_NT_HOSTBASED_SERVICE
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_delete_sec_context
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_import_name
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_init_sec_context
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_release_buffer
Referenced in /usr/lib/libcups.2.dylib
ipo-2: warning #11043: unresolved _gss_release_name
Referenced in /usr/lib/libcups.2.dylib


It seems, through some inevitable Xcode header / framework search path anomaly, "somebody" has confused Intel Performance Primitives with Internet Printing Protocol (Google has the same problem).

I can almost hear Moe saying "IPP?  Has anyone seen IPP?  How about IPO-PO?"

tim18
October 14, 2008 7:18 PM PDT

Among your Windows options, you have set /Oa ("treat argument pointers as not aliased") and various options requesting the compiler to generate both vector and non-vector code paths.  In your icc options, you include -restrict, but that doesn't have any effect without the restrict qualifiers in the argument list.  As you have specified the 32-bit compiler, note that SSE2 vectorization wasn't yet the default in the version you quote (use e.g. -xW).  You also have a bunch of contradictory alias options; why not simply use -ansi-alias (assert that the source code doesn't violate the C and C++ standards on aliasing)?

void test (float* restrict in, float* restrict out, int len)
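For reference, here is a minimal sketch of the whole function with those qualifiers applied. It assumes the file is compiled as C99, where restrict is a standard keyword; in a C++ file, icc would need the -restrict option that is already on the command line above. The alignment hints and #pragma ivdep are left out here only to keep the sketch portable to other compilers.

```c
/* Sketch only: same signature as the line above, with the original loop body.
   restrict promises the compiler that in and out never overlap, so no
   iteration of the loop can depend on a store made by another iteration. */
void test(float* restrict in, float* restrict out, int len)
{
    int i;

    for (i = 0; i < len; i++)
    {
        out[i] = in[i] + 0.5f;
    }
}
```

The promise is the caller's responsibility: passing overlapping buffers to this version is undefined behavior, which is the trade made in exchange for letting the vectorizer skip its aliasing checks.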




