dpeterc
September 5, 2008 3:21 PM PDT
huge c++ binaries compared to gcc
I work on several C and C++ projects, and alternate between gcc and icc for quality reasons.
I use roughly the same level of optimization with gcc and icc.
For C programs, the binary sizes generated by icc and gcc are quite comparable, within ±10%, which makes sense.
For the C++ program, icc does much worse.
The program consists of some 55,000 lines of C++ code, according to sloccount, and is Qt based.
Some rough figures of binary size:
Options   GCC       ICC
-O3 -s    1.75 MB   2.8 MB
-O2 -s    1.65 MB   2.7 MB
-O1 -s    1.6 MB    1.9 MB
With icc, I do not use any special optimizations such as -ipo, -parallel or -xT; if I do, the binary gets even larger.

While speed is important, icc's speed gains are in the range of 10-15% and do not justify such an increase in binary size. The main thing that puzzles me is why icc's C++ binaries are so much larger than its C binaries, relative to gcc.
Any ideas or suggestions?

I use the following:
OpenSUSE 10.3
icc (ICC) 10.1 20080602
gcc (GCC) 4.2.1
tim18
September 5, 2008 5:35 PM PDT
#1

icc -O1 -fp-model source is roughly equivalent to gcc -O3 -ffast-math (in your version of gcc, where you don't get auto-vectorization without asking for it); if you are satisfied with that speed, it's hard for me to get excited about an 8% code size increase. I don't see how you could have gotten such a code size increase from icc without vectorization, unless something unusual is going on with -ip, which you might suppress with -fno-inline-functions or reduced inlining limits.

If adding auto-vectorization to gcc (which icc -O2 and -O3 imply) gave you a significant speed advantage without the code size increase, I could understand your complaint. I just filed an issue about an extra dead vector code version. Where your loops use exclusively aligned data, you can reduce the vector code expansion with #pragma vector aligned.
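
A minimal sketch of the #pragma vector aligned suggestion, assuming a simple in-place scaling loop. The function name scale_aligned and the 16-byte alignment requirement are illustrative assumptions, not taken from the poster's code. The pragma is Intel-specific; other compilers are expected to ignore it, possibly with a warning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: multiplies n floats in place by factor.
// Assumption: the caller guarantees that data is 16-byte aligned.
void scale_aligned(float* data, std::size_t n, float factor) {
    // Check the alignment promise in debug builds.
    assert(reinterpret_cast<std::uintptr_t>(data) % 16 == 0);

    // Tells icc to assume aligned accesses in the loop that follows,
    // so it need not emit separate code paths for unaligned data.
#pragma vector aligned
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

The promise is only safe when every array passed in really is aligned, for example a buffer declared with alignas(16); with misaligned data the vectorized loop may crash.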

icpc normally inlines templates in cases where g++ economizes by using a single version invoked by multiple functions; I'm not certain whether inlining limits would control that, beyond what you did with -O1.



dpeterc
September 7, 2008 5:42 PM PDT
#2 Reply to #1
Thanks for the suggestions, Tim.
I tried adding -ftree-vectorize to the gcc compilation with -O3, but it did not make any significant change in the code size. It seems that auto-vectorization is already enabled with -O3 on gcc.
http://gcc.gnu.org/projects/tree-ssa/vectorization.html

The other tip regarding template inlining was quite fruitful: by setting
-inline-level=0
I could reduce icc's -O3 binary size to 2.2 MB, which is easier to swallow.

The C++ binary is still big compared to C. One of my other projects has 100,000 lines of C code, and yet it only makes a 1.4 MB binary when compiled with -O3, with both ICC and GCC. So relative to source size, at comparable optimization levels, a C++ binary will still be roughly 3 times bigger than a C one. I wonder if other people have the same experience, or if I am doing something wrong.

Dušan Peterc
http://www.arahne.si


tim18
September 7, 2008 9:53 PM PDT
#3 Reply to #2
gcc didn't add vectorization as part of -O3 until 4.3. It's still not nearly as aggressive in vectorization as icc, even if -ffast-math is set. The gcc -ftree-vectorizer-verbose option resembles icc -vec-report, except that the default is no report. Even when the same loops are vectorized, icc usually generates over twice as much additional code with vectorization as gcc, due to multiple versions for alignment and more unrolling. The i7 architecture removes several of the reasons for differences between the gcc and icc implementations of vectorization, but it hasn't made much difference in the compilers so far.


dpeterc
September 10, 2008 6:13 AM PDT
#4 Reply to #3
I found the true culprit for the huge binary size: C++ exceptions.
Luckily, I do not use them in my code.
So here are the binary sizes with the new options:
-O3 -s -fno-exceptions -fno-inline GCC: 1.25 MB ICC: 1 MB
So ICC can actually produce significantly smaller C++ code than gcc, if we use
-fno-exceptions -fno-inline
even with -O3, which enables advanced optimizations.
For me, reducing the binary size from 2.8 MB to 1 MB is a big thing.
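
For readers wondering what exception-free code looks like in practice, here is a minimal sketch, assuming errors are reported through return values instead of throw. The function parse_long is a hypothetical example, not part of the poster's program. It contains no try, catch or throw, so it should compile unchanged with -fno-exceptions.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parses a decimal integer from text.
// Returns true and stores the value in *out on success;
// returns false (leaving *out untouched) on any parse error.
bool parse_long(const char* text, long* out) {
    assert(text != nullptr && out != nullptr);

    char* end = nullptr;
    errno = 0;
    long value = std::strtol(text, &end, 10);

    // Fail on empty input, trailing garbage, or out-of-range values.
    if (end == text || *end != '\0' || errno == ERANGE) {
        return false;
    }
    *out = value;
    return true;
}
```

A caller checks the result directly, for example `long v; if (!parse_long(arg, &v)) { /* handle error */ }`. Disabling exceptions is only an option when neither your own code nor the libraries it calls rely on throwing across it.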

I found this tip in the following document:
http://developer.apple.com/documentation/Performance/Conceptual/CodeFootprint/Articles/CompilerOptions.html
It is GCC and Apple specific, but the ideas and compiler options are also mostly valid for Linux and ICC.



