I'm trying to speed up our image processing software, which uses some image arithmetic functions from IPPI. I first upgraded to the latest IPP (7.0) to let it do the OpenMP threading internally. That did not work: on a Core 2 Duo processor (Win7) I got no speedup at all for two threads over one, although Task Manager showed both cores pegged at 100%. I followed all the suggestions I could find on this forum, but nothing worked.
So now I've called ippSetNumThreads(1) to disable OpenMP and created two threads of my own: thread 1 processes the top half of a 1280x960 image and thread 2 processes the lower half. I do this by simply giving the second thread an offset into the image and having each thread process 960/2 = 480 lines.
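To make the split concrete, here is a simplified sketch of the approach, not the actual code. It assumes an 8-bit single-channel image, uses std::thread as a stand-in for whatever threading API is in use, and replaces the IPPI arithmetic calls with a plain saturating add loop. The names addConstRows and processInTwoHalves are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Stand-in for one IPPI arithmetic call (assumption: not a real IPP function).
// Adds `value` to `rows` lines of an 8-bit, single-channel image, saturating at 255.
void addConstRows(std::uint8_t* data, std::size_t stepBytes,
                  int width, int rows, std::uint8_t value)
{
    for (int y = 0; y < rows; ++y) {
        std::uint8_t* line = data + static_cast<std::size_t>(y) * stepBytes;
        for (int x = 0; x < width; ++x) {
            int sum = line[x] + value;
            line[x] = static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
        }
    }
}

// Splits the image into a top and a bottom half and processes each half
// on its own thread. The second thread simply gets a pointer offset by
// topRows * stepBytes into the same buffer.
void processInTwoHalves(std::uint8_t* image, std::size_t stepBytes,
                        int width, int height, std::uint8_t value)
{
    const int topRows = height / 2;           // 480 for a 960-line image
    const int bottomRows = height - topRows;  // remaining lines
    std::uint8_t* bottom = image + static_cast<std::size_t>(topRows) * stepBytes;

    std::thread t1(addConstRows, image, stepBytes, width, topRows, value);
    std::thread t2(addConstRows, bottom, stepBytes, width, bottomRows, value);
    t1.join();
    t2.join();
}
```

Each thread touches a disjoint range of rows, so no locking is needed between them.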
This does not work either, and I can't imagine why not. The total execution time on this machine for a series of arithmetic functions is about 16 ms per loop, whether I use a single thread to process the full image or two threads that each process half of it.
Can someone suggest what might be going on here?