We are developing an imaging application that uses one thread per image to correlate the image against a reference image. Each image is about 1 MB of data.
Without hyperthreading the correlation takes 100 secs; with hyperthreading enabled it takes 120 secs!
The application is written with MFC, runs under XP and uses native Win32 threads.
When I assign all threads to a single CPU (using SetThreadAffinityMask), the correlation again takes 100 secs.
When I assign the threads alternately to the two CPUs, it again takes 120 secs.
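For reference, the two affinity setups can be sketched roughly as follows. This is a simplified illustration, not our actual code: AffinityMaskFor and PinThread are made-up helper names, and it assumes the two logical CPUs are numbered 0 and 1. Calling it with cpuCount = 1 corresponds to the single-CPU case, cpuCount = 2 to the alternating case.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: builds an affinity mask that selects exactly one
// logical CPU, cycling through cpuCount CPUs as threadIndex increases.
std::uint64_t AffinityMaskFor(unsigned threadIndex, unsigned cpuCount)
{
    assert(cpuCount > 0 && cpuCount <= 64);
    return std::uint64_t{1} << (threadIndex % cpuCount);
}

#ifdef _WIN32
// Hypothetical wrapper: pins a thread to the CPU chosen above.
// SetThreadAffinityMask returns 0 on failure, the previous mask otherwise.
bool PinThread(HANDLE thread, unsigned threadIndex, unsigned cpuCount)
{
    DWORD_PTR mask = static_cast<DWORD_PTR>(AffinityMaskFor(threadIndex, cpuCount));
    return SetThreadAffinityMask(thread, mask) != 0;
}
#endif
```

With cpuCount = 2, threads 0, 2, 4, ... get mask 0x1 and threads 1, 3, 5, ... get mask 0x2; with cpuCount = 1 every thread gets mask 0x1.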
BTW: When running on one CPU, Task Manager shows 100% usage for the first CPU and 0% for the second.
When running on both CPUs, Task Manager shows 100% on both CPUs for 120 seconds!
And before someone asks: the correlation code is not stuffed with synchronisation objects, and there are no do-nothing while loops (busy waits).
Any idea where I can start tracking down this problem?
Thanks for any hints.