Compiling the Perl interpreter: icc 9.1 performs poorly compared to gcc 4.1

Hi,

We use icc 9.1.045 on a machine with two 5148 Woodcrest CPUs. The sole
purpose of this compiler is to build a highly optimized Perl
interpreter. We have had good results with icc on the P4 architecture,
where it gave us about 30% better results than gcc 3.3 (which itself
gave us about 15% better results than gcc 4.1.x!).

However, on the Core 2 Duo CPUs we experience a performance loss of about 12% when compiling with icc 9.1 compared to gcc 4.1.

After compiling the Perl interpreter, we benchmark it with perlbench
(see http://search.cpan.org/dist/perlbench/). So far we have obtained
these results:

Compiler + options                 Performance in %
gcc 4.1 -O2                        100
icc 9.1 -O2                        86
icc 9.1 -O2 -xT                    87
icc 9.1 -O2 -xT -ipo               --- (doesn't compile)
icc 9.1 -O2 -xT -ip                92
icc 9.1 -O3                        88
icc 9.1 -O2 -xT -parallel          87
icc 9.1 -O2 -xT -ip -parallel      92

It is interesting to see that -parallel doesn't seem to give any performance advantage.
All values are accurate to +/-2%, so even -O2 -xT -ip is, in the best case, 6% slower than gcc 4.1.
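
To make the arithmetic behind that statement explicit, here is a minimal Python sketch. It assumes the gcc 4.1 baseline is fixed at 100 and applies the +/-2 margin only to the icc scores; the function names are hypothetical and only serve this illustration. The scores are copied from the table above.

```python
# Scores copied from the results table (gcc 4.1 -O2 = 100).
BASELINE = 100
MARGIN = 2  # stated measurement accuracy, in percentage points

results = {
    "icc 9.1 -O2": 86,
    "icc 9.1 -O2 -xT": 87,
    "icc 9.1 -O2 -xT -ip": 92,
    "icc 9.1 -O3": 88,
    "icc 9.1 -O2 -xT -parallel": 87,
    "icc 9.1 -O2 -xT -ip -parallel": 92,
}


def best_case_deficit(score, baseline=BASELINE, margin=MARGIN):
    """Gap to the baseline if the score is off by the full margin in its favour.

    Assumption: the baseline itself is treated as exact.
    """
    return baseline - (score + margin)


def within_noise(a, b, margin=MARGIN):
    """True if the two scores differ by no more than the margin."""
    return abs(a - b) <= margin


if __name__ == "__main__":
    for name, score in results.items():
        print(f"{name:32s} {score:3d}  best-case deficit: {best_case_deficit(score)}")
```

With these numbers the smallest best-case deficit is that of the -ip builds, and each -parallel build is indistinguishable from its non-parallel counterpart within the stated margin.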

We do not need this binary to run on any other architecture, just Core
2. We also tried options such as -mtune=pentium4, but this tends to
generate even slower code.

All in all, it appears that icc 9.1 cannot handle the Core 2 Duo
very well yet. Has anyone had a similar experience? Despite the failure
of IPO, we're trying to generate a PGO-optimized binary and will report
the results later.

kind regards,
Adam Novotny - PetaMem R&D

Why don't you set -ansi_alias? Did that not help on P4?

-ansi_alias wasn't considered in the P4 tests. On Core 2 Duo it doesn't
make any difference (in the performance of the executable) whether it is
set or not. Thus it also doesn't help the icc binary's performance.

Adam

I don't know the Linux equivalent of -Oa (assume no aliasing), but IMO that option alone brings the most performance when compiling with ICC. Also, I do not understand why you are avoiding -O3. And -Qunroll and -Qip are a must.

-- Regards, Igor Levicki
