dpeterc
September 5, 2008 3:21 PM PDT
huge c++ binaries compared to gcc
I work on some C and C++ based projects, and alternate between gcc and icc for quality reasons.
I use roughly the same level of optimization with gcc and icc.
On C based programs, the binary sizes generated by icc and gcc are quite comparable, within ±10%, which makes sense.
On the C++ based program, icc does much worse.
The program consists of some 55,000 lines of C++ code, according to sloccount, and is Qt based.
Some rough figures of binary size:
-O3 -s   GCC: 1.75 MB   ICC: 2.8 MB
-O2 -s   GCC: 1.65 MB   ICC: 2.7 MB
-O1 -s   GCC: 1.6 MB    ICC: 1.9 MB
With icc, I do not use any special optimizations such as -ipo, -parallel or -xT; if I do, the binary gets even bigger.

While speed is important, the speed gains from icc are in the range of 10-15% and do not justify such an increase in binary size. What puzzles me most is why icc's C++ binaries are so much larger than its C binaries, relative to gcc.
Any ideas or suggestions?

I use the following:
OpenSUSE 10.3
icc (ICC) 10.1 20080602
gcc (GCC) 4.2.1
tim18
September 5, 2008 5:35 PM PDT
#1

icc -O1 -fp-model source is roughly equivalent to gcc -O3 -ffast-math (in your version of gcc, where you don't get auto-vectorization without asking); if you are satisfied with the speed of that, it's hard for me to get excited about an 8% code size increase. I don't see how you could have got such a code size increase from icc without vectorization, unless you have something unusual going on with -ip, which you might suppress with -fno-inline-functions or reduced in-lining limits.

If adding auto-vectorization to gcc (which icc -O2 and -O3 imply) gives you a significant advantage without the code size increase, I could understand your complaint. I just filed an issue about an extra dead vector code version. Where your loops use exclusively aligned data, you can reduce the vector code expansion with #pragma vector aligned.
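
As a rough illustration of the #pragma vector aligned suggestion, here is a minimal sketch. The array names, the element count and the scale_add function are hypothetical, and the 16-byte alignment is an assumption chosen to match SSE registers. The pragma is a promise to the compiler rather than a check, so the data really has to be aligned; it is guarded here so that other compilers simply skip it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical data: two arrays forced to 16-byte alignment (assumed to
// match SSE). __attribute__((aligned)) is a GNU-style extension.
static const std::size_t kCount = 1024;
float g_in[kCount] __attribute__((aligned(16)));
float g_out[kCount] __attribute__((aligned(16)));

// Adds factor * g_in[i] to g_out[i] for every element.
void scale_add(float factor) {
    // Intel-specific pragma: asserts that all array accesses in the loop
    // below are aligned, so alignment-handling code paths are not needed.
#if defined(__INTEL_COMPILER)
#pragma vector aligned
#endif
    for (std::size_t i = 0; i < kCount; ++i) {
        g_out[i] += factor * g_in[i];
    }
}
```

Whether this shrinks the generated code noticeably depends on how many vectorized loops the program has; comparing binary sizes with and without the pragma is the only reliable check.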

icpc normally inlines templates in cases where g++ economizes by using a single version invoked by multiple functions;  I'm not certain if inlining limits would control that, beyond what you did with -O1.
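
A small sketch of what "a single version invoked by multiple functions" can look like in source. The names sum_all, total_width and total_height are hypothetical, and the noinline attribute is a GNU-style extension that is assumed to be honored by the compiler in use; it is one possible way to request a shared out-of-line copy, not something confirmed to change icpc's behavior in this particular case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the single instantiation sum_all<int> is used by
// two callers. noinline asks the compiler to keep one out-of-line copy
// instead of expanding the loop at each call site.
template <typename T>
__attribute__((noinline)) T sum_all(const std::vector<T>& values) {
    assert(!values.empty());
    T total = T();
    for (typename std::vector<T>::size_type i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Both callers share sum_all<int> rather than each getting its own copy.
int total_width(const std::vector<int>& widths) { return sum_all(widths); }
int total_height(const std::vector<int>& heights) { return sum_all(heights); }
```

In a large template-heavy code base the same trade-off applies many times over: fewer inlined copies usually means a smaller binary at the cost of extra calls.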



dpeterc
September 7, 2008 5:42 PM PDT
#2 Reply to #1
Thanks for the suggestions, Tim.
I tried adding -ftree-vectorize to the gcc compilation with -O3, but it did not make any significant change in the code size. It seems that auto-vectorization is already enabled with -O3 on gcc.
http://gcc.gnu.org/projects/tree-ssa/vectorization.html

The other tip regarding template inlining was quite fruitful. By setting
-inline-level=0
I could reduce icc's -O3 binary size to 2.2 MB, which is easier to swallow.

The C++ binary is still big compared to C. One of my other projects has 100,000 lines of C code, and yet it only makes a 1.4 MB binary when compiled with -O3, with both ICC and GCC. So a C++ binary built from a comparable amount of source at the same optimization level will still be roughly 3 times bigger than a C one. I wonder if other people have the same experience, or if I am doing something wrong.

Dušan Peterc
http://www.arahne.si


tim18
September 7, 2008 9:53 PM PDT
#3 Reply to #2
gcc didn't add vectorization as part of -O3 until 4.3.  It's still not nearly as aggressive in vectorization as icc, even if -ffast-math is set.  The gcc -ftree-vectorizer-verbose option resembles icc -vec-report, except that the default is no report.  Even when the same loops are vectorized, icc usually generates over twice as much additional code with vectorization as gcc, due to multiple versions for alignment, and more unrolling.  The i7 architecture removes several of the reasons for differences in the gcc and icc implementations of vectorization, but it hasn't made much difference in the compilers so far.


dpeterc
September 10, 2008 6:13 AM PDT
#4 Reply to #3
I found the true culprit for the huge binary size: C++ exceptions.
Luckily, I do not use them in my code.
Here are the binary sizes with the new options:
-O3 -s -fno-exceptions -fno-inline GCC: 1.25 MB ICC: 1 MB
So ICC can actually produce significantly smaller C++ code than gcc if we use
-fno-exceptions -fno-inline
even with -O3, which enables advanced optimizations.
For me, reducing the binary size from 2.8 MB to 1 MB is a big thing.

I found this tip in the following document:
http://developer.apple.com/documentation/Performance/Conceptual/CodeFootprint/Articles/CompilerOptions.html
It is GCC and Apple specific, but the ideas and compiler options are also mostly valid for Linux and ICC.
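
For anyone wondering what "not using exceptions" looks like in practice, here is a minimal sketch of code that reports errors through return values and contains no throw, try or catch, so it should build with -fno-exceptions. The parse_percent function and its 0 to 100 range are hypothetical, not taken from my project. With exceptions enabled, compilers typically emit extra unwind information and cleanup code around objects with destructors, which is presumably where the size difference comes from.

```cpp
#include <cassert>
#include <string>

// Hypothetical parser: converts a string of 1 to 3 digits into a value
// in the range 0..100. Failure is reported by returning false instead of
// throwing, so callers need no try/catch. On failure *out is untouched.
bool parse_percent(const std::string& text, int* out) {
    assert(out != 0);
    if (text.empty() || text.size() > 3) return false;
    int value = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '9') return false;  // non-digit
        value = value * 10 + (text[i] - '0');
    }
    if (value > 100) return false;  // out of range
    *out = value;
    return true;
}
```

Note that libraries the program links against may still have been built with exceptions enabled, so the flag only affects the code you compile yourself.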



