I have a serial program for time-domain simulation that I'm parallelizing incrementally, and I want to compare the performance of the serial and parallel implementations. The system I use for the simulations is shared, so when I run the serial program several times on the same data (through VTune), I get a different elapsed time on each run. The same happens with the parallel implementation.
How can I compare the two implementations? Should I compare the CPU time (which seems to be the same on every run)? If so, should the CPU time of the parallel program be divided by the value from the CPU usage diagram (around 1.55) to be fair?
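To make the question concrete, this is roughly the calculation I have in mind, as a minimal Python sketch. The function names and all timing values are hypothetical placeholders, not measurements from my runs. Taking the minimum over repeated runs is only my assumption about how to reduce the noise from other users on the shared machine, and I'm not sure it is the right statistic.

```python
import statistics


def summarize(times):
    """Return (min, median) of repeated elapsed-time measurements, in seconds."""
    return min(times), statistics.median(times)


def speedup(serial_times, parallel_times):
    """Speedup estimated from the fastest run of each version.

    Assumption: the fastest run is the one least disturbed by other users.
    """
    return min(serial_times) / min(parallel_times)


def estimated_elapsed(cpu_time, avg_cpu_usage):
    """Elapsed time implied by total CPU time and the average number of busy cores."""
    return cpu_time / avg_cpu_usage


# Hypothetical values for illustration only (not measured).
serial_runs = [10.4, 10.0, 11.2]
parallel_runs = [6.9, 6.5, 7.3]

print("serial   (min, median):", summarize(serial_runs))
print("parallel (min, median):", summarize(parallel_runs))
print("speedup from fastest runs:", speedup(serial_runs, parallel_runs))

# The division I am asking about: CPU time divided by average CPU usage (1.55).
print("implied elapsed time:", estimated_elapsed(10.0, 1.55))
```

The last line is the step I'm unsure about: whether dividing the parallel CPU time by the average CPU usage gives a number that can be fairly compared with the serial CPU time.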
Any citations for a scientific way to compare the two (or even some keywords) would be much appreciated!
Thanks in advance,