Superlinearity Is Impossible; We Just Don't Always Think Correctly

I just received the April 2008 edition of Dr. Dobb's Journal (in my physical mailbox), and in this issue Herb Sutter continues his discussion of super linearity (where parallelizing a function produces a greater performance improvement than the number of applied processing cores). Here's Herb's article: "Super Linearity and the Bigger Machine".

I've thought a lot about Herb's last article, which I commented on in my "Can Parallelism Achieve Superlinear Performance Gains" post. In that article, Herb used the example of searching for data in a situation where the data is clumped as representing a case where superlinear performance is possible when an application is multithreaded.

My further thoughts about this situation brought me to the conclusion that a multithreaded search compared with the single-threaded search Herb posited would indeed achieve apparent superlinear performance. But, that is because the sensible approach for searching for clumped data would not be an element-by-element linear search through an array of data. In other words, the superlinearity would be achieved only because the proposed single-threaded methodology was not the ideal approach to the problem.

What if you lost your ring in a loaf of bread?

Think about it this way. You are baking a loaf of bread. You suddenly realize that your favorite ring is no longer on your finger! You search all over the counters. Finally, you conclude your ring must be inside that loaf of bread that's baking in the oven.

So, you wait until the bread is fully baked, then begin your search for the ring. Now, your ring is a relatively large object that is located somewhere inside that loaf of bread. It's "clumped data" within the background of solid bread. So, if your objective is to find that ring as quickly as possible, what will your approach be? Will you start from one end of the loaf, and cut almost infinitely tiny slices until you reach the ring? Or, would common sense suggest that you chop the loaf into slices that are more ring-sized? Since all you have to do is touch the ring, and you'll have found it.

The single-threaded method of starting at one end of the loaf and slicing tiny slices will clearly not be the most efficient method. The parallel method of chopping the loaf into eighths so you, your family, and several friends can each work with 1/8th of the loaf to try to find the ring will in most cases attain a greater than 8 times efficiency than the tiny-slices-starting-at-one-end method.

But would this really be superlinear performance? I think not. Because you yourself could have (should have) cut the loaf into eighths, then into 16ths, etc. That would have been the most efficient single-threaded method of finding that clumped data (i.e., your lost ring).


Superlinear performance is impossible. However, when we're programming, our brains don't always think of strategies that would be common sense if we were facing a "real world" problem (like finding a relatively large hard object inside a loaf of bread). Hence, parallelism will sometimes appear to produce superlinear results. But if we apply the most efficient method to the problem in the first place, the illusion of superlinear performance will never appear.

Kevin Farnham, O'Reilly Media TBB Open Source Community, Freenode IRC #tbb, TBB Mailing Lists

Download TBB
Einzelheiten zur Compiler-Optimierung finden Sie in unserem Optimierungshinweis.