Don't worry about sublinear speedup, be happy!

Beginning programmers often expect linear speedup in parallel programs, thinking two cores should be twice as fast as one. Of course, this is usually not the case. Limitations such as serial portions, bus bandwidth, cache, communication etc. start to kick in.

For Amdahl's law kicks in, the answer is the Gustafson-Basis Law's_Law - more cores let you handle a large problem. Bus bandwidth, cache limitations, and communication limits can be helped with by improving decompositions. But ultimately most programs do reach a limit, and unless the first limit reached is CPU speed, the speedup will be sublinear in the number of processors (P).

But here's the "be happy" part. If the CPU vendors kept on building more and more complex (and power hungry) serial processors, we would need a lot more power. The relationship is cubic. See, for example, the top slide on page 2 of . That requires an 8x power to achieve 2x speedup, and 64x power to get 4x speedup.

In contrast, power for multi-core is proportional to the number of cores (modulo cache power). So if you get "only" 2x speedup on 4 cores, or "only" 4x speedup on 16 cores, you areway ahead of the alternative history where serial processors continued to roam the earth.

Thus sqrt(P) speedup is not bad. Anything more is a free bonus.

For more complete information about compiler optimizations, see our Optimization Notice.


Sergey Kostrov's picture

>>...more cores let you handle a large problem...

Smart optimization techniques could also do some magic! Many modern software developers simply do not care about optimization of software codes.

anonymous's picture

Intel will remain exccelen for ever!
Nasser,Tehran,{ms degree,physics,teacher,}

anonymous's picture

very good

Add a Comment

Have a technical question? Visit our forums. Have site or software product issues? Contact support.