Under what conditions does ICC beat VC?

Gaiger Chen's picture

I have downloaded the IJL (Intel JPEG Library) for a JPEG encoding application (with IPP, of course).

I compiled the code with both VC and ICC and benchmarked the results. I find that ICC and VC perform EXACTLY the same for IJL.

I have tried O2, O3 and full optimization.

The difference is about ±1%.

I have also tried matrix multiplication, and VC and ICC give the same performance there too.

So I just want to ask: under what conditions should I use ICC to get a speedup?

Test platform:

Intel dual-core E6500 (2.93 GHz) with DDR2

Windows XP SP3
-------------------------------------------------
Visual Studio 2005 (VC 8)
Intel C++ 10.1
IPP 6.1

Thank you.

pvonkaenel's picture

Back when I was using VC++ 6, I was able to switch to the Intel compiler, set some flags and get a 25%-30% gain without any additional work. Since VS2005, I have been seeing results similar to yours: the Microsoft compiler has just gotten better. Additionally, I find it more difficult now to get a speed gain by reworking the code with intrinsics. Again, the compilers have gotten better and are doing that on their own.

At this point I find that, no matter what, I need to put in real additional effort to eke out further performance gains, but the Intel compiler can be very helpful in that area: now I tend to use the Intel-specific pragmas to help guide the compiler to make more informed decisions. I find this easier than figuring out the intrinsics myself, and it helps keep the code portable. They also have profile-guided optimization, but I have not tried using that.
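To illustrate the pragma approach described above, here is a minimal sketch in C. The function name `scale_add` is hypothetical and not taken from IJL or IPP. `#pragma ivdep` is an Intel compiler hint that asks the vectorizer to ignore assumed (unproven) dependencies in the loop that follows; the `__INTEL_COMPILER` guard is my own assumption about how one might keep other compilers from warning about an unknown pragma.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i] for i in [0, n).
 * The caller must guarantee that x and y do not overlap; the pragma
 * only passes that promise on to the compiler, it does not check it. */
void scale_add(float *y, const float *x, float a, size_t n)
{
    size_t i;
#ifdef __INTEL_COMPILER
#pragma ivdep   /* Intel-specific: ignore assumed loop-carried dependencies */
#endif
    for (i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Because the hint sits behind a guard, the same source still builds with VC, which is what keeps this approach more portable than hand-written intrinsics. Whether the loop is actually vectorized still depends on the compiler version and options.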

Peter

Thomas Willhalm (Intel)'s picture

Might it be that most of the time is spent in the library routines? If they are distributed as binaries, there won't be a difference in speed.

In my personal opinion, the most notable features of the Intel compiler are the advanced vectorizer and inter-procedural optimization (IPO). If these optimization features can solve a major problem in the code, this can result in significant performance improvements.

Jennifer J. (Intel)'s picture
Quoting pvonkaenel: "Back when I was using VC++ 6, I was able to switch to the Intel compiler, set some flags and get 25%-30% gain without any additional work. Since VS2005, I have been seeing results similar to yours - the Microsoft compiler has just gotten better."

It is true that the VS compiler has improved.
But for calculation-heavy, loop-heavy code such as matrix operations, the Intel C++ Compiler should still do better.

You could try adding more specific optimization options, such as /QxSSE3 or /QxSSE4.1, or /arch:SSE3 or /arch:SSE4.1, and test with and without /Qparallel (parallelizing the outer loop while vectorizing the inner loop).
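As a rough sketch of what "parallelizing the outer loop, vectorizing the inner loop" can look like for a matrix multiply, consider the C function below. The name `matmul` and the i-k-j loop order are my own assumptions for illustration, not the original poster's code. The idea is that each outer iteration writes only its own row of C, and the innermost loop walks B and C with unit stride.

```c
#include <stddef.h>

/* Hypothetical example: C = A * B for n x n row-major matrices.
 * A, B and C must not overlap. */
void matmul(const float *A, const float *B, float *C, size_t n)
{
    size_t i, j, k;

    /* Outer loop: iteration i writes only row i of C, so the iterations
     * are independent and are candidates for auto-parallelization. */
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            C[i * n + j] = 0.0f;
        }
        for (k = 0; k < n; ++k) {
            const float a = A[i * n + k];
            /* Inner loop: unit-stride access to B and C, which makes it
             * a candidate for vectorization. */
            for (j = 0; j < n; ++j) {
                C[i * n + j] += a * B[k * n + j];
            }
        }
    }
}
```

Whether a given compiler actually parallelizes or vectorizes these loops depends on its version, the options used, and what it can prove about aliasing, so it is worth timing each variant rather than assuming a gain.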

Jennifer

Gaiger Chen's picture

I downloaded the JPEG Viewer sample for IPP and built it with both ICC and VC. Note that the JPEG Viewer sample can be built with OpenMP support.
I captured a screenshot from World of Warcraft (near the well in front of the south bank in Dalaran, where many players gather, so the screen is very busy) and then encoded this image 100 times.
As you can see, with OpenMP turned off there is no difference between ICC and VC, but with OpenMP turned on, ICC is clearly better.
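For readers unfamiliar with OpenMP, the kind of loop-level threading being switched on and off here can be sketched as follows. This is not code from the JPEG Viewer sample: `rgb_to_luma` is a hypothetical function, and the fixed-point weights are a common approximation of the usual luma coefficients. Each row is independent, so the rows can be split across the two cores.

```c
#include <stddef.h>

/* Hypothetical example: convert interleaved 8-bit RGB to 8-bit luma.
 * The weights 77/256, 150/256 and 29/256 approximate 0.299, 0.587
 * and 0.114. */
void rgb_to_luma(const unsigned char *rgb, unsigned char *luma,
                 int width, int height)
{
    int row;

    /* Rows are independent, so they can be distributed across threads.
     * The pragma is ignored when OpenMP is not enabled. */
#pragma omp parallel for
    for (row = 0; row < height; ++row) {
        const unsigned char *src = rgb + (size_t)row * width * 3;
        unsigned char *dst = luma + (size_t)row * width;
        int col;

        for (col = 0; col < width; ++col) {
            unsigned int r = src[3 * col];
            unsigned int g = src[3 * col + 1];
            unsigned int b = src[3 * col + 2];
            dst[col] = (unsigned char)((77u * r + 150u * g + 29u * b) >> 8);
        }
    }
}
```

OpenMP is enabled with /openmp in VC and /Qopenmp in ICC on Windows; without those options the function runs serially and produces the same output.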

My test platform:
VC 8 (VS 2005)
ICC 10.1
Intel dual-core E6500 (2.93 GHz, 2 MB L2)
Windows XP SP3

Below is my test image (1440×900).

Jennifer J. (Intel)'s picture

It seems that the serial code is already well tuned. You only need to add multi-threading.

You can use Parallel Amplifier or VTune to find out if there are more hotspots that are still in serial code.

Jennifer
