Confused About Performance

Hi, I am profiling my code on three separate computers. The computer that is supposed to have the better specs runs 50% slower than the others when using IPP: the Core i7 930 runs my hot spots 50% slower than the Core i7 860 and the Core i7 920. I used Amplifier to profile. On the 930, the IPP functions I use are right at the top of the profile; on the computers where the code runs faster, they don't even show up. The code is exactly the same, so I don't understand. The 930 is a brand-new computer built from scratch two days ago, so it has less software installed. The motherboards are identical between the 920 and the 930, but the 920 still runs 50% faster than the 930. What else should I be looking at? Thanks.



Are you concluding that your hot spot runs 50% slower just from whether or not the functions appear in Amplifier? Or did you measure the actual time the code takes to run the same workload?


I measured the time as well: about 2.5 ms on the faster computers and over 5.0 ms on the machine in question. I don't think IPP is being dispatched properly, but I am not sure how to debug this.

Actually, that would be twice as slow, not 1.5x as slow.
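
To make the arithmetic explicit, here is a minimal Python sketch using the timings quoted above (2.5 ms and 5.0 ms). The function name `slowdown` is only illustrative and not part of any library.

```python
def slowdown(baseline_ms, observed_ms):
    """Return (factor, percent_longer) of the observed time relative to the baseline."""
    factor = observed_ms / baseline_ms
    percent_longer = (factor - 1.0) * 100.0
    return factor, percent_longer


if __name__ == "__main__":
    # Timings from this thread: fast machines vs. the machine in question.
    factor, percent = slowdown(2.5, 5.0)
    print(f"{factor:.1f}x the run time, i.e. {percent:.0f}% longer")
```

Taking twice as long means a 100% longer run time, or equivalently half the throughput, which is probably where the original "50% slower" came from.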

Do you link with IPP DLLs or threaded static libraries?

To check which CPU-specific code was dispatched by IPP, you can print the IPP version info (specifically the library name, which contains a CPU-specific prefix).


I link statically. I will try outputting which IPP code is being dispatched.

OK, now I'm really stumped. The 920 and the 930 are dispatching the same DLLs. I switched to non-static linking to see if it helps. It didn't; the performance is the same. The 930 still runs twice as slow, and the IPP functions are bubbling up to the top of Amplifier. The code is identical. What other factors should I be looking at? Should I swap CPUs and see if the 930 is faulty?

There are many factors which influence performance. But before jumping to conclusions, I'd like to reproduce your result. Is it possible for you to share your test case here?



Unfortunately I cannot, as we use some hardware. I will try to emulate it without the hardware we use.
I have tried reinstalling the OS, and going to RAID 0 and back. All the same: the i7 930 is still twice as slow. Is it a configuration issue somewhere? I also tried putting in faster RAM. I don't understand.
I will try to set up a test build so I can send it.


Threading influences performance.
Can you see that all cpu cores are at 100% in both cases?

My idea is that maybe your slower CPU is using fewer cores than it could.

Try letting the code run for several seconds, and use Task Manager with separate per-core graphs to check that all cores are fully used while your code runs.

Actually, it's not using 100% in either case... but it gets weirder. After a fresh reboot, the timing is about 2.7 ms per iteration in my hot spot. Then, after about a minute, it goes up to 5.5 ms per iteration and stays there until I reboot again. What is going on here? I even built a brand-new machine with a new motherboard and an i7 950. I am really stumped. Does anyone know of something that could be conflicting with IPP? I have tried dumping the list of running processes in both cases, and nothing jumps out at me. In both cases, the number of running processes is 47.

I have tried this suggestion, but it made no difference. I have attached an Excel workbook with two sheets outlining the timings. The IPP modules are the biggest difference. The first sheet is from the 950 (bad performance), the second from the 920 (good performance).
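
A jump like the one described (about 2.7 ms to about 5.5 ms after roughly a minute) is easier to pin down if the per-iteration timings are logged and scanned for a step change. The following is a rough Python sketch of that idea, not the poster's actual code: the helper names `time_iterations` and `find_slowdown`, the window size, and the 1.5x threshold are all assumptions chosen for illustration.

```python
import time
from statistics import median


def time_iterations(work, count):
    """Call `work` `count` times and return each call's duration in milliseconds."""
    durations = []
    for _ in range(count):
        t0 = time.perf_counter()
        work()
        durations.append((time.perf_counter() - t0) * 1000.0)
    return durations


def find_slowdown(samples_ms, window=20, threshold=1.5):
    """Return the first index whose following `window` samples have a median
    above `threshold` times the baseline, or None if there is no such index.

    The baseline is the median of the first `window` samples, so the run is
    assumed to start in the "fast" state (e.g. right after a reboot).
    """
    if len(samples_ms) < 2 * window:
        return None
    baseline = median(samples_ms[:window])
    for start in range(window, len(samples_ms) - window + 1):
        if median(samples_ms[start:start + window]) > threshold * baseline:
            return start
    return None
```

Recording a wall-clock timestamp next to the index reported by `find_slowdown` would make it possible to line the slowdown up with other events on the machine, such as a change in CPU clock.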


Attachment: 050vs920.zip (application/zip, 6.42 KB)

SOLVED: It is the power plan under Windows 7. I ended up reinstalling the OS on one of the machines that had been running at optimum speed. Not to my surprise, after the reinstall the performance was no longer optimal. By default, Windows 7 sets the power plan to Balanced. I changed it to High Performance and presto! All is well now. At least now I can sleep. :) Thanks for all the help and suggestions in this thread.
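
For anyone who wants to check this from a script, here is a minimal Python sketch that asks Windows for the active power plan via the `powercfg` command-line tool. It assumes `powercfg /getactivescheme` is available and prints a line containing the plan's GUID; the helper names are hypothetical. It matches on the GUID rather than the plan name, since the displayed name may be localized.

```python
import re
import subprocess

# GUIDs of the built-in Windows power plans.
BUILTIN_PLANS = {
    "381b4222-f694-41f0-9685-ff5bb260df2e": "Balanced",
    "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c": "High performance",
    "a1841308-3541-4fab-bc81-f71556f20b4a": "Power saver",
}

GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def parse_active_scheme(powercfg_output):
    """Extract the plan GUID from powercfg output.

    Returns (guid, name), where name is None for a custom (non-built-in) plan,
    or (None, None) if no GUID is found in the text.
    """
    match = GUID_RE.search(powercfg_output)
    if match is None:
        return None, None
    guid = match.group(0).lower()
    return guid, BUILTIN_PLANS.get(guid)


def active_power_plan():
    """Query the active power plan. Windows only; raises if powercfg fails."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return parse_active_scheme(result.stdout)
```

If the result is Balanced, the plan can be changed in the Power Options control panel, or (as far as I know) with `powercfg /setactive` followed by the GUID of the High performance plan.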

One last idea: some CPUs use throttling to keep the chip from overheating when a CPU-intensive process has been running for a while.
In that case, one could use CPU-Z to monitor the actual CPU clock and see whether it changes after a while.
