We are using the IPPI static multithreaded libraries with dispatching, in the latest version of IPP available as of 1-Jan-2010. We are seeing tremendous improvements over the single-threaded libraries, often ~2x. We do no threaded image processing ourselves; we rely entirely on the libraries. It really is an excellent product.
Now our dilemma: on XP, only one of the cores appears to "dance" in the graph on Microsoft's Performance Monitor while our app is running. On our Win 7 machines, all cores "dance".
What's got me stumped is that *regardless of the OS (Win7 or XP)*, the speed of our algorithms is best predicted by the number of cores on the system. So two identical machines, one with XP and the other with Win7, would show pretty much the same timings when running our IPPI application, despite what the graphs in the Microsoft Performance Monitor would suggest.
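To make the comparison concrete, here is a minimal sketch of the kind of timing check I mean. It does not use IPP at all: `sum_squares`, `run_split` and `time_ms` are hypothetical names, `sum_squares` is only a stand-in for one of our processing calls, and the manual `std::thread` split is an assumption made for illustration (in our real app the IPP libraries do the threading internally). The point is just to compare wall-clock time at different thread counts instead of trusting the monitor graph.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for an image-processing kernel:
// sums the squares of the elements in [begin, end).
static void sum_squares(const std::vector<float>& src,
                        std::size_t begin, std::size_t end, double& out) {
    double acc = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
        acc += static_cast<double>(src[i]) * src[i];
    }
    out = acc;
}

// Splits the buffer into `threads` contiguous slices, processes each slice
// on its own thread, and returns the combined result.
static double run_split(const std::vector<float>& src, unsigned threads) {
    assert(threads > 0);
    std::vector<double> partial(threads, 0.0);
    std::vector<std::thread> pool;
    const std::size_t chunk = src.size() / threads;
    for (unsigned t = 0; t < threads; ++t) {
        const std::size_t b = t * chunk;
        // The last slice also takes any remainder.
        const std::size_t e = (t + 1 == threads) ? src.size() : b + chunk;
        pool.emplace_back(sum_squares, std::cref(src), b, e,
                          std::ref(partial[t]));
    }
    for (auto& th : pool) {
        th.join();
    }
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}

// Returns the elapsed wall-clock time in milliseconds for one run and
// stores the computed value in `result`.
static double time_ms(const std::vector<float>& src, unsigned threads,
                      double& result) {
    const auto t0 = std::chrono::steady_clock::now();
    result = run_split(src, threads);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Calling `time_ms` on a large buffer with 1 thread and then with `std::thread::hardware_concurrency()` threads, on both the XP and the Win7 machine, gives numbers that can be compared directly, whatever the per-core graphs show.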
If this is the case, why do I care? Because I recently installed our application at a customer site on a suite of 8-core Xeon machines, and the first thing the customer did was bring up the MS Performance Monitor. He asked why only one core was being used. I gave him a dumb look and said I'd look into it.
I need to understand this issue and would very much appreciate any insights you could provide!